Did platform feeds sow the seeds
of deep divisions during the 2020 US presidential election? 


By Ekeoma E. Uzogara 


The advent of social media forever changed
how we consume news. At least half of Ameri- 
cans rely on it for news, and Facebook (owned 
by parent company Meta) is the most popular. 
Meta’s Facebook and Instagram platforms are 
funded through advertisements and generate 
more revenue when users spend more time on 
their platforms. To make platforms alluring and 
increase screen time, tech companies operate 
on business models that incentivize algorithms that are 
designed to elevate eye-catching content to the top of us- 
ers’ feeds—content that captures attention and may go 
“viral” by stimulating “engagement” through comments, 
likes, and resharing. 

But can a business model that prioritizes “engagement 
algorithms” pose a threat to democracy? Algorithms glean 
information about users’ interests to give them more of 
what they want to see. This could lead to polarization 
when algorithms curate emotionally evocative content 
that features divisive or inflammatory language. Whether 
this process has consequences at voting booths remains 
hotly contested. In a properly functioning democracy,
people must form political beliefs from accurate news, 
but social media’s architecture may leave partisans siloed 
in biased news sources, exposed to misinformation, and 
surrounded by like-minded individuals that reinforce 
their attitudes. Proposed solutions include the suppres- 
sion of reshared content or replacement of “engagement 
algorithms” with others that indiscriminately recommend 
content in reverse chronological order. 

In this special issue, these possible controls are explored 
through a pioneering collaboration between indepen- 
dent academics and Meta’s researchers, who had access 
to Meta’s internal data during the 2020 US presidential 
election. Research herein examines pressing questions 
around the impact of Meta’s platforms on polarization, 
political knowledge, attitudes, and behavior and also 
discusses trade-offs during this rare collaboration. Tech 
companies have a public responsibility to understand 
how design features of platforms may affect users and, 
ultimately, democracy. The time is now to motivate substantive
changes and reforms.

10.1126/science.adj7023
Independence by permission


Industry-academy collaboration explores the 2020 US election


By Michael W. Wagner 


In the spring of 2020, pressed by internal
research staff and with support
from the highest levels of the company, 
Facebook (now Meta) began a collabora- 
tion with outside academics to study the 
2020 US presidential election. Interest 
in social media’s role in elections came after 
evidence that the 2016 US election was rife 
with suspicious and divisive groups targeting
swing-state residents on Facebook (1). As
one of the initiatives the team used to demonstrate
transparency, I was approached to
serve as the project’s independent rapporteur.
I conclude that the team conducted rigorous,
carefully checked, transparent, ethical,
and path-breaking studies, reported by
González-Bailón et al. (2) and Guess et al. (3,
4) on pages 392, 398, and 404 of this issue,
respectively, and by Nyhan et al. (5), in the
first public results of this project. Though the
work is trustworthy, I argue that the project 
is not a model for future industry-academy 
collaborations. The collaboration resulted in 
independent research, but it was indepen- 
dence by permission from Meta. 

The research used platform-based inter- 
ventions of consented participants to show 
that both altering what individuals saw in 
their newsfeeds on Facebook and Instagram 
and altering whether individuals encountered
reshared content affected what people
saw on the platforms. However, these
changes did not reduce polarization or improve
political knowledge during the 2020
US election. Indeed, removing reshared content
reduced political knowledge.

After 3 years, the outcomes of these exper- 
iments are the first publicly shared results 
from this US 2020 Election Project (hereaf- 
ter, “2020EP”). The project is also producing 
papers on topics such as whether Facebook 
enables ideological segregation, whether 
campaign advertising on platform influ- 
ences attitudes and behaviors, the effects of 
coordinated inauthentic behavior (i.e., disin- 
formation spreading), and a comprehensive 
description of Meta’s political communica- 
tion ecosystem. Several papers are working 
their way through the peer review process. 

science.org SCIENCE

SOCIAL MEDIA AND ELECTIONS

SPECIAL SECTION

The 2020EP is a partnership between
Meta and 17 scholars working at universities
across the US (hereafter, “outside academics”)
(6). More than two dozen researchers,
engineers, legal analysts, and others worked 
on the project for Meta. The research leads 
were Chad Kiewiet de Jonge, Annie Franco, 
and Winter Mason (7). The principal inves- 
tigators of the study were outside academ- 
ics Natalie (Talia) Jomini Stroud and Joshua 
Tucker. All of the 17 outside academics on 
the team had a prior connection with Social 
Science One, a project that had previously 
piloted some research partnerships, with at 
best mixed results, between academic re- 
searchers and Facebook. 

The team hoped my role would increase 
the confidence that the public and the schol- 
arly community would have in the project. 
My charge includes commenting upon the 
nature of the collaboration between Meta 
and the outside academics and whether the 
project might serve as a model for future 
industry-academy collaborations. I was not 
compensated by Meta or the outside aca- 
demic team. I knew some members of both 
teams before I began my work. Since June
2020, I have observed more than 350 virtual
research meetings and 2 days of in-person 
research meetings, totaling more than 500 
hours. I conducted 41 interviews with Meta 
researchers and staff and members of the 
outside academic team. I also interviewed 
two former Meta employees, five major so- 
cial science research funders, and five aca- 
demic experts in the area who were not a 
part of the project. I have access to drafts 
of working papers and appendices, many 
of the conversations between the research 
team that took place over the web app 
Workplace, project code, drafts of project- 
governing documents, meeting agendas, 
some email correspondence, and observa- 
tions of team members at conferences and 
in-person work sessions. 

As I elaborate below, my conclusion is 
that for social science research about the ef- 
fects of social media platforms and their use 
to be truly independent, outside academ- 
ics must not be wholly reliant on the col- 
laborating industry for study funding, must 
have access to the raw data that animate
analyses, must be able to learn from inter- 
nal platform sources about how the plat- 
forms operate, and must be able to guide 
the prioritization of workflow. Additionally, 
some project structures that were appropri- 
ate to US-based faculty are unlikely to apply 
to other parts of the world. 


THE BIRTH OF THE PROJECT 
Research about how industry and the acad- 
emy collaborate tends to focus on issues 
of research productivity, patents acquired,
economic value, future collaborations, and 
conflict-of-interest concerns. Behaviors 
most associated with successful collabora- 
tions include flexibility with institutional 
factors, honesty in relationships, clarity in 
outputs, and sharing an awareness of the 
frameworks of the collaboration (8). In 
their model for industry-academy part- 
nerships, King and Persily suggest Social 
Science One’s structure of outside fund- 
ing, platform-provided data, and an aca- 
demic council (9). However, Social Science 
One faced intense criticism for major delays
from Facebook in data sharing (the
company cited privacy concerns) and for
a major error in the data found long after
Facebook provided it (10, 11). One funder in
the social media space that I interviewed 
called Social Science One a “rope a dope,” 
arguing that Meta strings researchers along 
with a promise for data that never comes, 
or comes with untenable compromises. De- 
spite the middling track record of the Social 
Science One partnership, Meta researchers 
argued that though there were multiple 
potential pathways to pursue the 2020EP, 
selecting scholars with connections to So- 
cial Science One was efficient and expedi- 
ent. Scholars outside of the 2020EP project, 
including blind reviewers of articles in this 
issue, have criticized the project for its US- 
centric focus and for only including those 
with Social Science One connections, leav- 
ing out less well-resourced scholars. 

Meta researchers were keen to conduct 
rigorous research of the 2020 election with 
outside academics to add a critical layer of 
legitimacy to their work. Several Meta em- 
ployees said that it was personally and pro- 
fessionally important to them to “course 
correct” after the emotional contagion 
study and the Cambridge Analytica scan- 
dals revealed that millions of unconsented 
users had their data collected to manipu- 
late their emotions or expose them to po- 
litical advertising, respectively (12). Meta 
researchers wanted a study that asked 
important scientific questions, engaged in 
rigorous analysis, respected platform us- 
ers’ privacy, practiced consented research, 
and collaborated with respected scholars. 
In addition to the internal advocacy at 
Meta, Meta faced pressure from the acad- 
emy, the public, and lawmakers to share 
their knowledge, resources, and data. 

In early 2020, Meta approached Tucker 
and Stroud to serve as co-principal inves- 
tigators (PIs) for the project. The co-PIs 
selected the rest of the academic team with- 
out Meta’s input or approval, though they 
agreed to constrain themselves to the nearly 
100 North American-based scholars affili- 
ated with Social Science One. 


GUARDRAILS 

The outside academics and Meta research- 
ers generally agreed about what would make 
the project a success. They wanted to conduct 
rigorous, transparent, and ethical studies 
about important questions related to social 
media’s effects on the 2020 US presidential 
election. Consistent with research highlight- 
ing the importance of agreeing upon outputs, 
the team agreed to study four key outcomes 
across more than a dozen papers: political 
participation, political polarization, knowl- 
edge and misperceptions, and trust in US 
democratic institutions. The team agreed 
that each study should be preregistered, spec- 
ifying hypotheses, research designs, measure- 
ment strategies, how effect sizes would be 
interpreted, and reserving the right to con- 
duct additional analyses. 

Additionally, the team felt it was impor- 
tant that the studies be credible—and be 
perceived to be credible. Thus, the outside 
academics would forgo compensation from
Meta. The outside academics asked for, and 
received, “control rights” for the papers. This 
meant that in the event of a dispute between 
the outside academics and the Meta research- 
ers over how to interpret a finding (or if there 
was a dispute between outside academics co- 
authoring the same paper) or what to include 
in a research paper, the lead author—which 
would always be an outside academic— 
would have the final decision. The only ca- 
veat to this rule would be if Meta’s legal team 
determined there was a privacy issue or other 
legal constraint that might prevent publica- 
tion. To date, that has not occurred. 

The outside academic team’s PIs indicated 
on several occasions that if Meta refused to 
publish the findings of any study, they would 
walk away from the project. To address the 
potential that Meta’s leadership would learn 
of results prior to publication and seek to 
either curtail publication or create a pre- 
emptive public relations strategy, Meta re- 
searchers and outside academics agreed to 
not share any results outside of their research 
team until the papers had been through the 
peer review process. To my knowledge, this 
rule was violated once, at a private dinner 
in which a member of the outside academic 
team discussed the results of a paper with 
others. This led to a series of reiterations 
of the rule with the full team. Additional 
“guardrails” for the project included: 
• Privacy requirements would allow only
Meta employees to handle raw data.
Outside academics were permitted and
encouraged to develop and check code,
but could not have the data on their
computers.

• As much of the data as possible would be
made available to the broader academic
community for future scholarship on the
Inter-university Consortium for Political
and Social Research (ICPSR) (13).

• Participants whose individual-level
data would be analyzed would provide
informed consent prior to participating.
This included more than 75,000 participants
in experimental analyses and
400,000 participants who took part in at
least one of the six survey waves, from
August 2020 to March 2021.

• All coauthors would be credited.


Although many people not connected 
to the project praised the guardrails, oth- 
ers argued that some of the choices made 
by the team—such as taking no payment 
from Meta—would prevent less well-re- 
sourced US academics and academics from 
many other countries from pursuing a collaboration
with Meta, potentially limiting
the kinds of research questions asked and 
methods used to answer them. 


OBSERVATION OF COLLABORATION 

My observations of research team meet- 
ings for each of the individual papers and 
weekly planning meetings between the 
lead Meta researchers and the lead outside 
academic researchers revealed a deep com- 
mitment to high-level social science schol- 
arship. Conversations were substantive, 
detailed, generally collegial, and usually 
focused on scientific questions of research 
design, measurement, coding, data analy- 
sis, internal Meta policies, Meta’s data ar- 
chitectures, and quality control. 

Meta project lead and outside academic 
PI conversations also focused on how the 
work would be received by reporters, other 
academics, and the public, as well as legal 
and privacy issues within Meta. The leads
created a Roles & Responsibilities docu- 
ment (which was completed 2 years into 
the project) and developed a variety of new 
procedures on an ad hoc basis to address 
unforeseen circumstances. As members of 
both the Meta and outside academic teams 
regularly said, “we are building the plane 
while we are flying it.” 

One contribution of the project is the 
ongoing set of procedures the research 
team created, followed, and altered over 
time. A massive quality assurance initia- 
tive was launched in real time to check for 
errors. Rules relating to research designs, 
as well as data transparency and analysis,
were developed in an ongoing fashion
for the first 2 years of the project. The
research team felt the tension between a
preference to codify their roles and responsibilities
in advance and the considerable
time pressure to design and preregister the 
studies before the election season. In some 
ways, this process hewed to best practices 
in industry-academy collaborations with 
respect to the clarity of outcomes. In other 
ways, it highlighted the importance of flex- 
ibility in industry-academy collaborations 
as many processes changed substantially 
over time. 

For example, the guardrail of not dis- 
cussing a paper’s findings until it had 
been through peer review has changed 
considerably. Early in the project, the 
guardrail meant “accepted for publica- 
tion.” Some team members suggested that 
“peer reviewed” could be redefined as be- 
ing achieved even if a paper was rejected 
for publication. In the spring of 2023, the 
team began moving toward a policy that
researchers could share findings from the
papers not yet published in limited set- 
tings before those papers completed any 
peer review because the project has taken 
so long. 

In general, the Meta team agreed to pur- 
sue the topics and research designs that 
outside academics wanted to pursue. The 
nature of the collaboration allowed the 
outside academic researchers to peer in- 
side Meta in a way that would have been 
unavailable to them without the industry 
collaboration. On occasion, this access, together with advice
they received from Meta researchers, led
the outside academics to reshape what
they sought to study, or how they sought
to study it, as they learned details about
what data were collected and structured by
Meta for analysis. For example, the outside
academics were not pre- 
cisely aware of how Face- 
book groups were joined 
and, in some cases, left, by 
users. Coming to under- 
stand this led to changes in 
the working paper examin- 
ing behavioral polarization. 
Additionally, the academics 
learned about corporate 
cultural behaviors at Meta, 
on topics ranging from paid 
time off (Meta staff did not
work while on vacation, unlike
many of the outside academics,
who regularly did) to the privacy
reviews from Meta’s legal 
team that were necessary 
to pass before the papers 
could be published. Meta 
researchers also noted that 
some outside academics 
were not aware of what 
could be analyzed on platform. Although 
this process provided clear advantages 
for outside academics as they could ben- 
efit from institutional knowledge, one 
social media expert who was not on the 
team pointedly noted that, “what data is 
made available shapes what is asked and 
answered,” an issue for readers to actively 
consider when, for example, interpreting 
the degree to which the results reflect in- 
dependent user behavior as compared to 
Facebook and Instagram’s algorithms and 
other platform features affecting user be- 
havior. A former Meta employee warned, 
“Facebook researchers will answer every 
question they get from the professors hon- 
estly; they are ethical professionals. How- 
ever, they are also corporate employees. 
If the professors don’t ask the exact right 
question, Facebook staff won’t volunteer 
that they know how to get what you (outside
academics) are really asking for.”

After learning of the Social Science 
One data error in the fall of 2021 (10), the 
2020EP team began discussing and de- 
veloping major quality-control protocols, 
checking data and code with multiple sets 
of eyes across the team. This process took 
months to complete and persists to this 
day on some working papers. Although 
researchers across both groups are confi- 
dent that the data still contain errors, as 
all datasets do, more was done to mitigate 
consequential errors on this project than 
on any social science research project that 
any of them, or I, could name. 

Although Meta researchers always con- 
sulted and sought agreement with the 
outside academic PIs on project workflow 
choices, they structured the choices to 
prioritize the experimental analyses. This 
was generally efficient; Meta researchers 
created (and repeatedly revised) a Gantt 
chart to describe the proposed timeline of 
progress across the papers, informing the 
outside academics that it was faster to an- 
alyze the experiments as compared to the 
data gathering and analysis required for 
the other, nonexperimental papers. How- 
ever, some Meta researchers felt that the 
experimental analyses were less likely to 
produce large effect sizes on the outcomes 
of interest. Though speculative, this could 
mean that this first wave of published ar- 
ticles will get the most public attention— 
because they are the first papers released 
in the project—even though other papers 
may produce results that show larger sub- 
stantive effects. 

It is impossible to study everything in 
one project, and this project’s enormous 
scope played a considerable role in the 
length of time it is taking to complete pa- 
pers. However, two major topics that did 
not receive much attention were race in 
the 2020 elections and the ways in which 
Facebook and Instagram follower net- 
works affected various outcomes of inter- 
est. Some of the outside academic team 
members noted that the very nature of the 
team, which did not include scholars pri- 
marily known for studying race, explained 
the omission. Others argued that the issue 
was one of statistical power, whereas oth- 
ers noted that Meta does not collect racial 
identifiers of its users. Regarding studying 
the follower networks of the study par- 
ticipants to obtain a more comprehensive 
understanding of social media’s role in the 
election, Meta said no, citing privacy con- 
cerns. One outside academic said, “it was 
the only time I heard a hard no, and I un- 
derstood—privacy—but if you want to un- 
derstand how social media works, you need 
to study who follows who.” 


A SUCCESS, BUT NOT A MODEL

Perhaps the biggest problem with the 
2020EP is that it worked so well; the high 
quality of the scholarship may encourage 
interpretations that the project is a model 
for future social media company-academy
collaborations. Researchers were permitted 
to study almost everything they asked to 
study. The outside academics had the final 
say regarding how findings were interpreted 
and presented in the papers; and though 
I have observed no reason to believe that 
this is likely to change, it is the case that as 
these articles approached publication, Meta 
sought to reopen policies and procedures 
related to control rights. Specifically, Meta 
researchers wanted to be able to expressly 
disagree with lead author interpretations of 
findings or other matters in articles they co- 
authored. Stroud and Tucker opposed this 
effort, noting that collaborators can remove 
their name from a paper if they have a fun- 
damental disagreement. 

The study contained extraordinary data 
and code checking processes—a system 
that caught errors and slowed down the 
project by several months. The study was 
conducted ethically. There was informed 
consent approved by multiple universi- 
ties and an independent institutional re- 
view board (IRB), albeit the IRB from the 
same company that conducted the survey 
research elements of the project (NORC; 
formerly known as the National Opinion 
Research Center); and the original outside 
IRB contacted to review the project declined
to review it for what I believe are
public relations reasons rather than
research ethics concerns. There was rigorous
peer review after detailed preregistration.
The studies produced new knowledge
that was shared with the world without 
edit from Meta’s legal or corporate teams. 
Meta demonstrated a strong commitment 
to rigorous social scientific research about 
the ways in which Facebook and Insta- 
gram influence elections in the US, spend- 
ing more than $20 million and redirecting 
the time of dozens of employees to serve 
the project. Concerns that Meta would leak 
negative study results have not material- 
ized to date. 

But Meta set the agenda in ways that 
affected the overall independence of the 
researchers. Meta collaborated over work- 
flow choices with the outside academics, 
but framed these choices in ways that drove 
which studies you are reading about in this 
issue of Science. Moreover, the collabora- 
tion has taken several years and countless 
hours of time—limiting the ability of the 
outside academics to pursue other research 
projects that may have shaped important 
public and policy conversations. 


One shortcoming of industry-academy
collaboration research models more generally,
which is reflected in these studies, is
that they do not deeply engage with how 
complicated the data architecture and pro- 
gramming code are at corporations such as 
Meta. Simply put, researchers don’t know 
what they don’t know, and the incentives 
are not clear for industry partners to reveal 
everything they know about their platforms. 

In an era in which state and national 
lawmakers proffer a general rhetorical 
and increasingly policy-related antipathy 
toward higher education, independent 
scholarship is more important than ever. 
In the end, independence by permission 
is not independent at all. Rather, it is a 
sign of things to come in the academy: in- 
credible data and research opportunities 
offered to a select few researchers at the 
expense of true independence. Scholarship 
is not wholly independent when the data 
are held by for-profit corporations, nor is 
it independent when those same corporations
can limit the nature of what is studied.
Creative social media-academy-funder
partnerships, or, more likely, government 
regulation and data-sharing requirements 
(e.g., the European Union’s Digital Services 
Act) that also provide privacy protections, 
as well as defined structures to encourage 
and protect industry-employed research- 
ers to collaborate, are necessary to foster 
opportunities for path-breaking, compre- 
hensive scholarship that does not require a 
social media platform’s permission. 
Does Facebook enable ideological segregation in political news consumption? We analyzed exposure
to news during the US 2020 election using aggregated data for 208 million US Facebook users. We 
compared the inventory of all political news that users could have seen in their feeds with the information 
that they saw (after algorithmic curation) and the information with which they engaged. We show that
(i) ideological segregation is high and increases as we shift from potential exposure to actual exposure to
engagement; (ii) there is an asymmetry between conservative and liberal audiences, with a substantial corner 
of the news ecosystem consumed exclusively by conservatives; and (iii) most misinformation, as identified 
by Meta’s Third-Party Fact-Checking Program, exists within this homogeneously conservative corner, which 
has no equivalent on the liberal side. Sources favored by conservative audiences were more prevalent on 
Facebook's news ecosystem than those favored by liberals. 


Social media platforms have a large and
growing role in shaping access to information,
including information about politics
(1). However, few studies systematically
map this part of the information ecosystem,
with minimal examination of within-platform attention
and information consumption (2). There
is virtually no research comparing what people 
could potentially see within platforms and what 
they actually see [with the exception of (3)]. In 
this work, we address these questions with data 
on users’ access to and engagement with political
news on Facebook during the US 2020 election. 
Herein, we examine the supply of and demand 
for political news by tracking (i) the inventory of 
all news content that users could have seen, (ii) 
the subset of content users actually saw on their 
feeds, and (iii) the smaller subset of content 
with which users engaged (via clicks, reactions, 
likes, reshares, or comments). The analysis of 
this “funnel of engagement”—from potential 
exposure to actual exposure to engagement— 
gives a comprehensive picture of the informa- 
tion environment on Facebook during the 2020 
election. Our analyses show that both algo- 
rithmic and social amplification play a part in 
increasing ideological segregation. Algorithmic
amplification refers to data-driven automated
processes that result in some content being more 
visible in users’ feeds; social amplification refers 
to choices made by users that also grant more 
visibility to specific content through sharing 
and reposting. We show that these processes 
operate asymmetrically across the US politi- 
cal “right” (conservatives or Republican Party) 
and the political “left” (liberals or Democratic 
Party), with the presence of much more homo- 
geneous news consumption on the right—a 
pattern that has no parallel on the left. 

There has been a vigorous debate about the 
role of the internet in shaping the information 
that people see. Generally, individuals align 
their attention with their interests, but this 
tendency does potentially pose a challenge 
for a democracy if different people see funda- 
mentally incompatible political information. 
Early research (4, 5) argued that personali- 
zation algorithms (filter bubbles) and social 
curation processes (echo chambers) increase 
the probability that people will surround them- 
selves with ideologically compatible information. 
Subsequent research has found that online news 
diets are still fairly diverse ideologically and 
not more segregated than offline news consumption
(6-13). Much of this research, however, relies
on web-browsing data, and there is still very little
research on the news that people see within 
platforms such as Facebook and Twitter. The 
literature also suggests asymmetric news con- 
sumption: There is a substantial right-leaning 
bubble that attracts a much more homoge- 
neously conservative audience that does not 
have an equivalent on the left (7, 14), and there
is a higher prevalence of unreliable content on
the right relative to the left (15, 16). 

Key limitations in past work include a reliance 
on domain-level aggregations of exposure, thus 
missing curation effects at the news-story level. 
For example, both liberals and conservatives 
may read content from the Wall Street Journal 
domain, but if liberals only read news stories 
and conservatives only read opinion content, 
it would be misleading to consider the Wall 
Street Journal a meeting place for liberals and 
conservatives. Domain-level aggregations of ex- 
posure may thus understate the level of segre- 
gation in news consumption. 
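
The Wall Street Journal example can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than the study's data or method: the four toy views, the helper names `conservative_share` and `mean_distance_from_balance`, and the toy score itself (the average distance of each unit's conservative share from an even split) are hypothetical and are not the segregation measures used in the paper.

```python
from collections import defaultdict

# Hypothetical toy views: (domain, url, reader_ideology). Not data from the study.
views = [
    ("wsj.com", "wsj.com/news/a", "liberal"),
    ("wsj.com", "wsj.com/news/a", "liberal"),
    ("wsj.com", "wsj.com/opinion/b", "conservative"),
    ("wsj.com", "wsj.com/opinion/b", "conservative"),
]


def conservative_share(rows, key_index):
    """Share of conservative views per unit (domain if key_index=0, URL if 1)."""
    counts = defaultdict(lambda: [0, 0])  # unit -> [conservative views, all views]
    for row in rows:
        unit, ideology = row[key_index], row[2]
        counts[unit][0] += ideology == "conservative"
        counts[unit][1] += 1
    return {unit: c / n for unit, (c, n) in counts.items()}


def mean_distance_from_balance(shares):
    """Toy score: mean of 2 * |share - 0.5|, from 0 (mixed) to 1 (one-sided)."""
    return sum(abs(s - 0.5) * 2 for s in shares.values()) / len(shares)


# Same views, aggregated at two different levels.
domain_score = mean_distance_from_balance(conservative_share(views, 0))
url_score = mean_distance_from_balance(conservative_share(views, 1))
print(domain_score, url_score)
```

In this toy case the domain looks perfectly mixed while each individual story has a one-sided audience, which is the sense in which domain-level aggregation may understate segregation.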

Another important limitation of past work 
is a lack of data about individual exposure to 
content within platforms. One highly cited ex- 
ception is a paper that examined exposure to 
news content on Facebook throughout the funnel
of engagement (3). This paper found that
there are social and algorithmic effects directing 
ideologically compatible content to users at each 
stage of the funnel, but the algorithmic effects
are modest in size compared with social curation 
and individual choice. This study is now quite 
dated (the data are from 2014), it has problem- 
atic generalizability because of the peculiar sub- 
sample of Facebook users it examined (i.e., the 
small fraction of people who volunteered their 
partisan identity in their profile), and it only 
analyzed content posted by friends, whereas 
today Pages and Groups are also important 
providers of content within the platform (these
are public and private profiles created by organizations
and collectives). In section 4.2 of the
supplementary materials (SM), we offer more 
details about these limitations and how we
overcame them. 

It is plausible that ideological concordance 
is a much stronger driver in following Pages 
and/or joining Groups than in selecting friends. 
That is, although it is likely that people choose 
politically oriented Pages and/or Groups based 
in part on ideological concordance, existing re- 
search suggests that ideological agreement plays 
only a small role in how people choose friends
(and presumably many Facebook friends are 
friends, in a sociological sense) (17, 18). Finally, 
little of this past research compares findings 
regarding news consumption across internet- 
based modalities (e.g., exposure through browsing 
the web versus seeing content on Facebook's Feed). 

To address these limitations, this paper an- 
swers the following six research questions (RQs): 
RQ1-How ideologically segregated is news con-
sumption on Facebook, and are those patterns
of segregation symmetric on the right and left; 
RQ2-How does segregation vary with potential 
news consumption versus actual exposure ver- 
sus engagement; RQ3-How does segregation 
vary if the level of analysis is URLs rather than 
domains (thus capturing curation of content 
within domains); RQ4-How segregated is expo- 
sure on Facebook relative to the benchmark of 
browsing behavior (the predominant source of 
data in past research); RQ5-How segregated 
are the streams of content from the major path- 
ways to exposure on Facebook (friends, Pages, 
and Groups); and RQ6-How prevalent is expo- 
sure to unreliable content on the right relative 
to the left. 


Study overview 


This research is part of the US 2020 Facebook 
and Instagram Election Study, a collaborative 
effort between Meta and a team of external 
researchers that was initiated in early 2020 to 
design and produce transparent and reproduci- 
ble research on the political impact of Facebook 
and Instagram (SM section S1). The data in this 
paper draw from the set of N ~ 208 million US-
based adult active users whose political ideology
can be measured and track all URLs classified as 
political news that were posted on the platform 
from 1 September 2020 to 1 February 2021. In 
other words, our results are based on the ac- 
tivity generated by the nearly full set of US-based 
users that are active on the platform, which was 
231 million users during our study period (SM 
section S2). 

To identify political news URLs, we used 
Facebook’s internal civic and news classifiers 
(SM section S3.4). We used URLs and domains 
as our units of observation to calculate segre- 
gation metrics at the story and source levels. 
We also analyzed the overall news ecosystem 
through the analysis of clustering in coexposure 
networks (where nodes are news stories or 
domains and edges encode the number of unique 
users exposed to a given pair; Fig. 1A). Our 
analyses focus on political news posted by friends, 
Pages, and Groups. These posts create the sup- 
ply of content we reference using the term 
“inventory” (Fig. 1B). This inventory is deter- 
mined by the underlying network that users 
build on the platform (i.e., choices to follow 
Pages, join Groups, and “friend” individuals). 
This network is itself the result of another 
curation process that is determined by users’ 
preexisting networks and interests, algorithmic
recommendations on whom to follow, and plat-
form affordances (i.e., the existence of Pages
and Groups; SM section S3.2).
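
The coexposure network described above can be sketched as pair counts over an exposure log. This is a minimal illustration, assuming a hypothetical per-user mapping `exposures`; the study itself worked with aggregated data, and the function name `coexposure_edges` is not from the paper.

```python
from collections import Counter
from itertools import combinations

# Hypothetical exposure log: user id -> set of domains (or URLs) seen.
exposures = {
    "u1": {"a.com", "b.com"},
    "u2": {"a.com", "b.com", "c.com"},
    "u3": {"c.com", "d.com"},
}


def coexposure_edges(exposures):
    """Count, for every pair of items, the unique users exposed to both."""
    edges = Counter()
    for items in exposures.values():
        # Sort so that each unordered pair has one canonical key.
        for pair in combinations(sorted(items), 2):
            edges[pair] += 1
    return edges


edges = coexposure_edges(exposures)
print(edges[("a.com", "b.com")])  # 2
print(edges[("c.com", "d.com")])  # 1
```

Each key is an edge between two nodes, and the count is the edge weight (the number of unique users exposed to both items).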

Facebook’s algorithm ranks content from the 
inventory and presents a selection in list form 
in users’ Feeds (SM section S3.3). Users can then
choose to engage with that content by clicking, 
reacting, liking, commenting, or resharing. En- 
gagement on the platform generates data that 
are then fed back into the algorithmic curation 
of content for users’ feeds. This is one of the key 
characteristics of social media: Algorithmic and 
social curation processes are in constant feed- 
back. In our analyses, we only examined posts 
classified as political news that contain a URL 
(SM section S3.1), which amount to about 3%
of all posts shared by US adult users and 3.9%
of all content that US adult users saw on the 
platform during our study period. In other words, 
the algorithmic curation process that we analyzed 
here relies on many more signals than those 
generated by the content we analyzed (i.e., 
political domains and URLs). 

For each URL (and corresponding domain), 
we have measures of the potential, exposed, 
and engaged audience. The potential audience 
of a URL is the set of unique users that could 
have been exposed to that content, the exposed 
audience is the set of unique users that saw a 
post containing that URL on their Feed, and
the engaged audience is the set of unique users 
that clicked, reacted, liked, reshared, or com- 
mented on the post with the URL [note that
our measure is agnostic about the affect (as in
subjective feeling or emotion) of the engagement].

Fig. 1. Summary of data and levels of analysis. (A) Our analyses center on three different levels: domains
(to calculate segregation metrics at the news source level), URLs (to calculate segregation metrics at the news-story
level), and coexposure networks, where nodes are domains or news stories and the edges encode the number
of unique users exposed to a given pair. The analysis of coexposure networks allows us to determine whether there is
notable clustering in exposure to sources and, within sources, in exposure to stories. The schematic examples in
subpanels (a) to (c), for instance, do not show strong evidence of clustering, but the networks in subpanels (d)
and (e) do. (B) Schematic representation of the funnel of engagement (from potential exposure to actual exposure to
engagement). Numbers indicate the level of analysis implied in our RQ2: (i) Are news inventories ideologically
segregated, (ii) does segregation increase after algorithmic curation, and (iii) does segregation increase based on
what content users choose to engage with. (C and D) Number of news domains (categorized as untrustworthy
or trustworthy) (C) and news URLs (rated false and not rated false by Meta’s 3PFC) (D) in our data, presented as
daily counts. The increase in the number of unique news stories depicted in (D) between September and November
is partly an artifact of the threshold that we impose on the analysis of URLs (for privacy reasons, we only
analyze URLs that were shared more than 100 times by US-based users during the observation period). Yellow
indicates the volume of news domains or news URLs that were rated untrustworthy or false, and blue indicates the
volume of news domains or news URLs that were not rated untrustworthy or false. The red dashed lines highlight
November 3 and January 6, which are the dates of the election and the US Capitol riot, respectively.

News stories are classified as “false” if the URL was
rated false by Meta’s Third-Party Fact-Checking
Program (3PFC) as of 15 February 2021 (SM sec-
tion S3.4). This measure of “misinformation” likely
undercounts the total volume of false news cir- 
culating on the platform; but whereas specific 
false news stories may go undetected (i.e., only 
a tiny amount of inventory ever makes it to the 
3PFC), untrustworthy labels at the domain level, 
which are applied whenever a domain had two 
or more URLs rated false by the 3PFC, have bet- 
ter coverage and are more reliable (SM section
S4.1). Our segmentation of audiences based on
ideology relies on Facebook’s internal ideology 
classifier, which predicts the political ideology
of US adult active users (SM section S3.4). For 
every URL (and domain), we have the count of 
users who viewed the post who are predicted to 
be conservative and liberal. 
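
The three audiences and the per-item ideology counts can be illustrated with a toy example. The user ids, labels, and the helper `audience_counts` are hypothetical, and the nesting of the three sets is an assumption of this sketch; the study only had access to aggregated counts.

```python
# Hypothetical per-user records for one URL (illustrative only).
potential = {"u1", "u2", "u3", "u4", "u5"}  # could have seen the URL
exposed = {"u1", "u2", "u3"}                # saw a post with the URL in Feed
engaged = {"u1"}                            # clicked, reacted, liked, reshared, or commented

# Predicted ideology label per user (assumed output of a classifier).
ideology = {"u1": "C", "u2": "C", "u3": "L", "u4": "L", "u5": "L"}


def audience_counts(users, ideology):
    """Return (conservative count, liberal count) for a set of users."""
    conservatives = sum(1 for u in users if ideology[u] == "C")
    liberals = sum(1 for u in users if ideology[u] == "L")
    return conservatives, liberals


# In this toy example each funnel stage is a subset of the one before it.
assert engaged <= exposed <= potential

funnel = {
    "potential": audience_counts(potential, ideology),
    "exposed": audience_counts(exposed, ideology),
    "engaged": audience_counts(engaged, ideology),
}
print(funnel)
```

Here the audience goes from mostly liberal at the potential stage to entirely conservative at the engaged stage, which is the kind of shift the funnel analysis is designed to detect.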


Data 


Our data are based on activity that includes 
the N ~ 208 million US-based users who are 
active on the platform and whose ideology 
can be predicted using Facebook’s classifier. 
To protect privacy, we do not have access to 
individual-level data: We only analyzed URLs 
that were shared (either privately or publicly)
more than 100 times, and we only analyzed 
aggregate exposure and engagement metrics 
for US adult Facebook users in relation to the 
URLs that were shared (SM section S2.1). 
In total, our data comprise aggregated expo-
sure and engagement metrics for N ~ 208 mil-
lion US adult active users with an ideology
score in relation to N ~ 35,000 unique domains 
and N ~ 640,000 unique URLs that were clas- 
sified as political news and were shared more 
than 100 times during our observation pe- 
riod. The top five most-viewed domains are, 
in decreasing order, cnn.com, dailywire.com, 
foxnews.com, nytimes.com, and nbcnews.com. 
However, the most-viewed news stories (URLs) 
are not necessarily published by the most-viewed 
domains. For instance, pjmedia.com published 
the most-viewed story during the observation 
period (the story, under the headline “Military 
ballots found in the trash in Pennsylvania— 
Most were Trump votes,” was viewed 113,272,405
times), but the domain is ranked in the 51st po-
sition (SM section S4.1).

Fig. 2. Segregation and audience polarization at the domain and URL levels.
(A) The segregation score based on exposed audience and calculated according to
Eq. 1 is consistently higher at the URL level, suggesting that there are information
curation practices with news stories that get masked when aggregating the data
at the domain level. (B and C) Segregation scores drawn from exposed audiences
are higher than those based on potential audiences but lower than the scores from
engaged audiences (the difference between potential and engaged audiences is only
visible at the domain level). This suggests that algorithmic and social amplification
both contribute to segregation (note that we use different y-axis ranges to facilitate
comparison across curves for domains and URLs). (D) The mean favorability scores
(calculated according to Eq. 2, with -1 indicating a homogeneously liberal audience and
1 a homogeneously conservative audience) suggest that, overall, audiences that are
consuming political news have a right-leaning slant (i.e., all scores are above the zero line).
(E and F) The density plots show that the distribution is substantially skewed toward
the right, with more domains and URLs being favored by very conservative audiences.
Vertical black lines mark the quantiles of the distributions; scores are calculated for each
domain and URL according to Eq. 2. For (A) to (D), shaded regions indicate 95%
confidence intervals for the time trend based on a local polynomial regression.

Figure 1C shows the time
series of daily counts for domains, and Fig. 1D 
shows the daily counts for URLs. Only a small 
fraction of all domains is classified as “untrust- 
worthy”; the fraction of false news stories is 
barely perceptible because it is very small com- 
pared with the full volume of news stories. 
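
Two data-preparation rules stated in the text (the sharing threshold for URLs and the domain-level "untrustworthy" label) can be sketched as follows. The records are invented, and applying the label rule after the threshold filter is an assumption of this sketch rather than something the text specifies.

```python
from collections import defaultdict

SHARE_THRESHOLD = 100      # URLs must be shared more than 100 times
FALSE_URLS_FOR_LABEL = 2   # two or more false-rated URLs -> "untrustworthy"

# Hypothetical records: (url, domain, share count, rated false by fact-checkers).
records = [
    ("x.com/1", "x.com", 250, True),
    ("x.com/2", "x.com", 180, True),
    ("x.com/3", "x.com", 40, False),   # dropped: not shared more than 100 times
    ("y.com/1", "y.com", 900, False),
    ("y.com/2", "y.com", 300, True),
]

# Keep only URLs above the sharing threshold.
kept = [r for r in records if r[2] > SHARE_THRESHOLD]

# Count false-rated URLs per domain, then apply the domain-level rule.
false_counts = defaultdict(int)
for url, domain, shares, rated_false in kept:
    false_counts[domain] += int(rated_false)

untrustworthy = {d for d, n in false_counts.items() if n >= FALSE_URLS_FOR_LABEL}

print(len(kept))      # 4
print(untrustworthy)  # {'x.com'}
```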


Measures 


Our analyses rely on two measures: the segre- 
gation index (which offers a summary statistic 
of the entire information environment) and 
the favorability scores (which are associated 
with individual domains and URLs and allow 
us to infer the ideological composition of their 
audiences). In particular, we define the segre-
gation index on a given period $t$ as

$$S_t = \sum_{n \in N} \frac{C_n}{C}\,\frac{C_n}{v_n} - \sum_{n \in N} \frac{L_n}{L}\,\frac{C_n}{v_n} \qquad (1)$$

where $n \in N$ indexes the domains and URLs in
our data (with daily temporal resolution), $C_n$ is
the count of conservatives exposed to a par-
ticular domain or URL $n$, $L_n$ is the count of
liberals, $v_n$ is the total number of unique views
for domain or URL $n$, and $C$ and $L$ are the total
number of conservatives and liberals, respec-
tively. For the analyses presented here, we di-
chotomized the ideology scores such that users
with a score <0.35 are categorized as liberal,
and those with a score ≥0.65 are categorized
as conservative (see SM section S4.10 for alter-
native operationalizations and robustness tests).
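
The dichotomization step can be sketched as a simple thresholding function. This assumes ideology scores lie in [0, 1] and that users between the 0.35 and 0.65 cutoffs are left unlabeled; both points, and the name `dichotomize`, are assumptions of the sketch.

```python
def dichotomize(score, liberal_max=0.35, conservative_min=0.65):
    """Map a predicted ideology score to a label.

    The cutoffs follow the text. Treating scores between the cutoffs as
    unlabeled is an assumption of this sketch.
    """
    if score >= conservative_min:
        return "conservative"
    if score < liberal_max:
        return "liberal"
    return None  # middle of the scale: excluded from both groups


labels = [dichotomize(s) for s in (0.10, 0.50, 0.65, 0.90)]
print(labels)  # ['liberal', None, 'conservative', 'conservative']
```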

This index follows (6) and is adapted from 
research on residential segregation. It captures 
the extent to which conservatives and liberals 
visit the same neighborhoods in the information 
ecosystem. The first term is the (visit-weighted) 
average exposure of conservatives to conserva- 
tives, and the second term is the average expo- 
sure of liberals to conservatives. As a point of 
comparison, Gentzkow and Shapiro (5) examined 
the browsing behavior of a representative panel 
of individuals and found a gap of exposure to 
conservatives between conservatives and lib- 
erals of 7.5 percentage points (60.6% for con- 
servatives and 53.1% for liberals). 
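
The two terms described above (average exposure of conservatives to conservatives, minus average exposure of liberals to conservatives) can be computed directly from per-item counts. The sketch below is illustrative: the function name is hypothetical, and taking the totals as sums of the per-item counts is an assumption.

```python
def segregation_index(items):
    """Isolation-style segregation index over (C_n, L_n, v_n) triples.

    C_n and L_n are conservative and liberal counts for item n, and v_n is
    its total views. The totals C and L are taken here to be the sums of
    C_n and L_n, which is an assumption of this sketch.
    """
    total_c = sum(c for c, _, _ in items)
    total_l = sum(l for _, l, _ in items)
    # Visit-weighted average conservative share seen by conservatives.
    cons_exposure = sum((c / total_c) * (c / v) for c, l, v in items)
    # Visit-weighted average conservative share seen by liberals.
    lib_exposure = sum((l / total_l) * (c / v) for c, l, v in items)
    return cons_exposure - lib_exposure


# Two toy items with fully separate audiences: the index is 1.
separate = [(100, 0, 100), (0, 100, 100)]
# Two toy items with identical mixed audiences: the index is 0.
mixed = [(50, 50, 100), (50, 50, 100)]

print(segregation_index(separate))  # 1.0
print(segregation_index(mixed))     # 0.0
```

On this scale, the browsing benchmark quoted above corresponds to 0.606 - 0.531 = 0.075.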

The favorability scores are defined as

$$f_n = \frac{C_n - L_n}{C_n + L_n} \qquad (2)$$

These scores, used also in prior work (9, 19),
are assigned daily as well as for the entire period
to each domain and URL. The score equals 1 when
a given domain or URL $n$ has an audience formed
exclusively by conservatives and -1 when it
has an audience formed exclusively by liberals
(0 means that conservatives and liberals are
equally likely to be in the audience of a domain or
URL $n$). We define audience polarization as the
extent to which the distribution of favorability 
scores is bimodal and far away from zero. 
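
A minimal sketch of the favorability score follows, assuming it is the conservative-minus-liberal difference divided by the sum of the two counts. That form matches the endpoints described in the text (1, -1, and 0); the function name is hypothetical.

```python
def favorability(c_n, l_n):
    """Favorability score: +1 all-conservative, -1 all-liberal, 0 balanced."""
    return (c_n - l_n) / (c_n + l_n)


print(favorability(80, 0))   # 1.0
print(favorability(0, 80))   # -1.0
print(favorability(40, 40))  # 0.0
print(favorability(75, 25))  # 0.5
```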

Fig. 3. Audiences of untrustworthy domains and false URLs. (A) The density plots show that most domains
that are categorized “untrustworthy” are favored by audiences that are predominantly conservative. For potential,
exposed, and engaged audiences, 71, 76, and 76% of untrustworthy domains have audiences that are conservative
on average (favorability score > 0). (B) The audience of domains posted by Pages is the most conservative.
(C) The data suggest a stark conservative-leaning slant for news stories labeled as “misinformation.” For potential,
exposed, and engaged audiences, 97% of false URLs in each case have audiences that are conservative on
average. (D) The sharing of news stories labeled as “false” is much more concentrated among conservative Pages
than among Groups and users. Vertical black lines mark the quantiles of the distributions; scores are calculated
for each domain and URL according to Eq. 2.


We calculated the segregation index and the 
favorability scores for potential, exposed, and 
engaged audiences. We complemented these 
analyses with network-based measures of clus- 
tering in exposed audiences by examining the 
backbone of coexposure networks (20, 21). This
network approach allowed us to detect segrega- 
tion in news exposure using only behavioral 
traces that do not take into account the ideology 
of audiences (SM section S6). As we show below, 
the clustering that emerges from the behavior 
of users is aligned with the ideology partition. 


Results 


Figure 2 provides answers to RQs 1 to 4. The 
segregation score based on exposed audience 
for domains fluctuates around 0.35 (i.e., the gap 
between the intersection of conservatives with 
conservatives versus liberals with conservatives 
is 35 percentage points). This is substantially 
higher than values found in prior research based 
on web-browsing behavior, which range from 
0.02 to 0.1 (6, 9, 11). The score for URLs is
substantially higher still, fluctuating roughly 
between 0.45 and 0.5, in line with a mechanism 
of ideological curation of content within domains. 
As we move down the funnel of engagement, 
ideological segregation increases. Segregation 
scores for potential audiences are far higher 
than what prior research has estimated for news 
consumption (online and offline), scores for ex- 
posed audiences are higher than those based on 
potential audiences, and, for domain-aggregated 
data, scores for engaged audiences are higher 
than those for exposed audiences (Fig. 2, B and 
C). This suggests that algorithmic and social am- 
plification are both contributing to increased 
segregation: As we move down the funnel of 
engagement (i.e., as the footprint of algorith- 
mic and social curation becomes more evident), 
liberal and conservative audiences become 
more isolated from each other. 

The mean favorability scores indicate that 
audiences that are consuming political news 
on Facebook have, overall, a right-leaning slant
(all scores are above the zero line; Fig. 2D). 
However, the mean score is a somewhat mis- 
leading summary statistic, given the full dis- 
tribution (Fig. 2, E and F). The density plots tell 
us that the favorability scores are substantially 
skewed toward the right: There are more do- 
mains and URLs being favored by very con- 
servative audiences. To address RQ4 in more 
depth, we ran supplementary analyses in which 
we compared segregation and favorability scores 
for the subset of Facebook users that consented 
to provide data of their web activity (SM sec-
tion S2.2). With this group of users, we ran
intraperson comparisons of exposure to news
content on- and off-platform. These compari-
sons reveal that the segregation score on Face-
book is three times as high as a web-tracking
benchmark for the same set of individuals (SM 
section S5). 

Figure 3 provides answers to RQs 5 and 6. 
Most sources of misinformation are favored by 
conservative audiences. The distribution of fa- 
vorability scores does not shift substantially as 
we move down the funnel of engagement (Fig. 3, 
A and C), suggesting that algorithmic and so-
cial amplification do not exacerbate the already
existing audience segregation for misinforma- 
tion content. However, misinformation shared 
by Pages and Groups has audiences that are 
more homogeneous and completely concen- 
trated on the right (Fig. 3, B and D). 

Fig. 4. Top communities in coexposure networks. (A and B) Mean favorability scores for the top three
communities by views in the weekly coexposure networks for domains (A) and URLs (B). The graphic in the lower-
left part of each panel shows the full network for week 1 collapsed to the communities (i.e., each node is a
group of domains and URLs with higher audience overlap). The presence of clusters in these networks suggests
selective exposure: The networks are clearly organized around large clusters formed by mostly liberal or mostly
conservative audiences, regardless of the level of aggregation (domain or URL). These clusters also provide
additional evidence of asymmetric ideological segregation, which is especially visible in the URL networks:
Across the observation window, there is a sizable cluster of news stories consumed predominantly by very
conservative audiences who are exposed to a higher percentage of unreliable news stories (color coded).

The SM contains additional analyses disag-
gregating the data for Pages, Groups, and users;
COVID-related content; high-political-interest 
and low-political-interest users (under different 
measures of political interest); and weekly 
aggregations. The picture that these addition- 
al analyses draw is that, overall, the patterns 
that we identify here are consistent under 
different operationalizations and data aggre- 
gations. These additional analyses reveal two 
further insights: (i) Pages and Groups contrib- 
ute much more to segregation and audience 
polarization than users (SM section S4.4), and
(ii) users classified as having high political in-
terest are twice as segregated as low-interest
users (SM section S4.7). 

Figure 4 provides additional evidence that 
confirms the robustness of the ideological asym-
metry in news consumption (RQ1) and the prev-
alence of exposure to unreliable content on the
right relative to the left (RQ6). The analysis of
coexposure networks (exposed-audience only) 
allows us to determine whether we observe evi- 
dence of segregation even if we know nothing 
about the ideology of users: The presence of 
clusters in the network is evidence of selective 
exposure (which, in this case, is shaped by users’ 
social networks and news-feed algorithms). 

We used a community detection approach 
to identify the presence of these clusters (20). 
Figure 4 plots the top three communities, 
sized in terms of views, for each weekly net- 
work. Each of these communities is formed by 
domains and URLs that have a higher overlap 
of audiences internally than externally (i.e., 
with domains and URLs in other communi-
ties). We then calculated the mean favor-
ability score for each community (i.e., the average
score for the domains and URLs classified in the 
same community). This revealed that the clus- 
tering that was identified based only on coex- 
posure behavior maps onto the two sides of the 
ideological continuum: There is clearly a cluster 
of domains and URLs that liberal users are 
coexposed to, and a cluster on the conservative 
side. However, we see again a major asymme- 
try between the left and right. News sources and 
stories consumed by conservative audiences 
depart more clearly from the zero line of cross- 
cutting exposure, which means that their au- 
diences are more homogeneously conservative 
and, therefore, more isolated. These outlets on 
the right also post a higher fraction of news stories 
(URLs) rated false by Meta’s 3PFC program (color 
coded in the figure), which means that conserva- 
tive audiences are more exposed to unreliable 
news. (See SM section S6 for more details on 
network construction, statistics, and figures.) 
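
The logic of this analysis (find clusters from coexposure alone, then check their mean favorability) can be illustrated with a deliberately simplified stand-in. The sketch below drops weak edges with a fixed threshold and takes connected components. It is not the backbone extraction or community detection method cited above, and all data and names are hypothetical.

```python
from collections import defaultdict

# Hypothetical coexposure edges: (item_a, item_b) -> shared unique users.
edges = {
    ("l1", "l2"): 90, ("l2", "l3"): 80,
    ("c1", "c2"): 95, ("c2", "c3"): 85,
    ("l3", "c1"): 5,  # weak cross-cutting tie
}
# Hypothetical favorability score per item.
favorability = {"l1": -0.6, "l2": -0.5, "l3": -0.4,
                "c1": 0.8, "c2": 0.9, "c3": 0.7}


def communities(edges, min_weight):
    """Connected components after dropping edges below min_weight."""
    neighbors = defaultdict(set)
    nodes = set()
    for (a, b), weight in edges.items():
        nodes.update((a, b))
        if weight >= min_weight:
            neighbors[a].add(b)
            neighbors[b].add(a)
    seen, groups = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbors[node] - group)
        seen |= group
        groups.append(group)
    return groups


groups = communities(edges, min_weight=50)
# Mean favorability per cluster, computed only after clustering.
means = [round(sum(favorability[n] for n in g) / len(g), 2) for g in groups]
print([sorted(g) for g in groups])
print(means)
```

The clustering step never looks at ideology, yet in this toy network the two clusters it finds sit on opposite sides of the zero line.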


Discussion 


Our analyses highlight that Facebook, as a 
social and informational setting, is substantially 
segregated ideologically, far more than previ-
ous research on internet news consumption based
on browsing behavior has found. But our analyses 
also show that individual preferences and platform 
affordances intertwine in a complex fashion. 
Ideological segregation manifests far more in 
content posted by Pages and Groups than in con- 
tent posted by friends. That Pages and Groups are 
associated with higher levels of ideological segre- 
gation suggests that the choice of which Pages 
to follow and which Groups to join is driven far 
more by ideological congruence than the choice 
of with whom to be friends, as discussed earlier. 
An important direction for further research is 
to understand how individuals discover and 
decide to follow Pages and join Groups. 

The research we report here has some limi- 
tations. First, following the convention of most 
of the literature, we focus on URLs and domains 
because this allows us to examine how Facebook 
enables access to content that is produced off- 
platform. However, this approach excludes 
platform content that may generate different 
segregation patterns. Second, our definition of 
misinformation at the URL level relies on third- 
party fact checkers and their limited coverage 
of URLs. And third, we rely on Facebook’s classi- 
fiers in part because they perform better than 
ad hoc alternatives, but testing the robustness of 
our results with other classifiers is an area that 
requires more research. Future research would 
ideally also incorporate non-URL content and 
expand the coverage of our misinformation met- 
ric by relying on human annotation of random 
samples or automated classification methods. 

More generally, future research should ana- 
lyze how the underlying social graph, and algo- 
rithmic recommendations on whom or what 
to follow, determine the inventory of content 
available on the platform. The algorithmic pro- 
motion of compatible content from this inven- 
tory is positively associated with an increase in 
the observed segregation as we move from po- 
tential to exposed audiences. Future research 
should consider whether segregation of poten- 
tial audiences increases over time as a long-term 
effect of this algorithmic promotion. We hypoth- 
esize that many of the patterns that charac- 
terize Facebook’s funnel of engagement also 
characterize other social media. 

Finally, it is essential to replicate these anal- 
yses for future elections and in other national 
contexts to evaluate to what extent these results 
are contingent on the particular moment and 
the national and/or cultural setting. This research 
and, more generally, the US 2020 Facebook and 
Instagram Election Study have created a set of 
innovative processes and conventions to improve 
research transparency and integrity (SM section 
S12) and research ethics (SM section S1.3), which
can serve as a blueprint for future collaborations 
between academic and industry researchers. 
We offer a proof of concept of the feasibility of 
productive collaboration between industry and 
external researchers, which demonstrates the 
credible empirical insights that can be gained 


Gonzalez-Bailon et al., Science 381, 392-398 (2023) 


through such collaboration. Present regulatory 
efforts, such as the European Union’s Digital 
Services Act (DSA), include requirements that 
would enable academic researchers to access 
data from the largest internet platforms and 
thus should facilitate replication of our analy- 
ses. Our research has produced a machinery of 
processes and checks that can provide the foun- 
dation for such research infrastructure. Compared 
with the baseline created by past research (3, 22), 
we have greatly moved the dial toward greater 
transparency and accountability. 

Our results uncover the influence that two key 
affordances of Facebook—Pages and Groups— 
have in shaping the online information envi- 
ronment. Pages and Groups benefit from the 
easy reuse of content from established produ- 
cers of political news and provide a curation 
mechanism by which ideologically consistent 
content from a wide variety of sources can be 
redistributed. As a result of social curation, 
exposure to URLs is systematically more segre- 
gated than exposure to domains. In the 20th 
century, local news media oligopolies slanted 
news coverage toward the mainstream of local 
audiences (23). In the 21st century, content 
from a large range of accessible sources may 
be providing the raw material for ideologically 
homogeneous feeds—similar to those produced 
by Pages and Groups. These patterns also have 
important implications for future research in 
this area: A focus on domains rather than URLs 
will likely understate, perhaps substantially, 
the degree of segregation in news consumption 
online. 

Finally, our results uncover the clearly asym- 
metric nature of political news segregation on 
Facebook—the right side of the distributions 
for potential, actual, and engaged audiences 
looks robustly different from the left side. Thus, 
although there are homogeneously liberal and 
conservative domains and URLs, there are far 
more homogeneously conservative domains and 
URLs circulating on Facebook. This asymmetry 
is consistent with what has been found in other 
social media platforms (24-26). We also observe 
on the right a far larger share of the content 
labeled as false by Meta’s 3PFC. Overall, these 
patterns are part of a broader set of long- 
standing changes associated with the fracturing 
of the national news ecosystem, ranging from 
Fox News to talk radio, but they are also a 
manifestation of how Pages and Groups pro- 
vide a very powerful curation and dissemina- 
tion machine that is used especially effectively 
by sources with predominantly conservative 
audiences (14). 


How do social media feed algorithms affect attitudes
and behavior in an election campaign?


Andrew M. Guess, Neil Malhotra, Jennifer Pan, Pablo Barbera, Hunt Allcott, Taylor Brown,
Adriana Crespo-Tenorio, Drew Dimmery, Deen Freelon, Matthew Gentzkow, Sandra Gonzalez-Bailon,
Edward Kennedy, Young Mie Kim, David Lazer, Devra Moehler, Brendan Nyhan,
Carlos Velasco Rivera, Jaime Settle, Daniel Robert Thomas, Emily Thorson, Rebekah Tromble,
Arjun Wilkins, Magdalena Wojcieszak, Beixian Xiong, Chad Kiewiet de Jonge, Annie Franco,
Winter Mason, Natalie Jomini Stroud, Joshua A. Tucker


We investigated the effects of Facebook’s and Instagram’s feed algorithms during the 2020 US election. 
We assigned a sample of consenting users to reverse-chronologically-ordered feeds instead of the 
default algorithms. Moving users out of algorithmic feeds substantially decreased the time they spent on 
the platforms and their activity. The chronological feed also affected exposure to content: The amount 
of political and untrustworthy content they saw increased on both platforms, the amount of content 
classified as uncivil or containing slur words they saw decreased on Facebook, and the amount of 
content from moderate friends and sources with ideologically mixed audiences they saw increased on 
Facebook. Despite these substantial changes in users’ on-platform experience, the chronological feed
did not significantly alter levels of issue polarization, affective polarization, political knowledge, or
other key attitudes during the 3-month study period.


What are the effects of machine-learning
algorithms used by social media compa-
nies on elections and politics? The notion
that such algorithms create political
“filter bubbles” (1), foster polarization
(2, 3), exacerbate existing social inequalities
(4, 5), and enable the spread of disinforma-
tion (6) has become rooted in the public con-
sciousness. Because the algorithms used by
social media companies are largely opaque to
users, there are numerous conceptions or “folk
theories” of how they work and disagreements
about their effects (7, 8). Understanding how
these systems influence key individual-level
political outcomes is thus of paramount scien-
tific and practical importance (9).

In particular, feed-ranking systems (algorithms
designed to optimize the order in which content
is presented to a user) have received consider-
able scrutiny from regulators and the public.
But studying the effects of these feed algo-
rithms is challenging. Even with direct access
to proprietary code and data, it would be dif-
ficult to characterize their impact (10) because
such algorithms are personalized on the basis
of many factors, such as a user’s past behavior
and predictions derived from the actions of
similar users, which leads to complex and poten-
tially heterogeneous effects on content ranking.
Furthermore, a well-established counterfactual 
is not always obvious: Technology companies 
frequently conduct “A/B tests,” or randomized 
controlled experiments, to refine the user experi- 
ence, but these are typically designed to evaluate 
discrete elements such as the weight of individ- 
ual inputs or how information is displayed in a 
specific panel (11). Evaluating the total impact
of a machine-learning feed-ranking system 
necessitates a well-understood alternative for 
comparison. 

With these challenges in mind, we examined 
the effect of specific feed algorithms on indi- 
viduals’ political attitudes and behaviors. We 
did so by conducting randomized controlled 
experiments within the context of the 2020 
US presidential election campaign on Facebook 
and Instagram, two major social media plat- 
forms with more than 3.5 billion combined 
monthly active users worldwide. We took the 
algorithmic ranking systems on Facebook and 
Instagram, as they existed in the fall of 2020, 
as the status quo (henceforth “Algorithmic Feed”). 
In a randomly assigned treatment group, users’
feeds were ranked in reverse-chronological 
order (henceforth “Chronological Feed”). Although 
arranging items in chronological order technically 
constitutes an algorithm, it is not predictive in 
the way that prominently scrutinized machine-learning
systems, such as those used by social
media platforms, are. The simple chronological
ranking, in which the most recent item is pre- 
sented at the top of one’s feed, is both easy to 
implement—it was originally used on both 
Facebook and Instagram and continues to be 
an option on multiple social media platforms 
(12)—and matches an alternative currently being
proposed in public debates by policy-makers 
and members of civil society (13). 
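
To make the contrast with predictive ranking concrete, the reverse-chronological rule described above can be sketched in a few lines. This is an illustrative sketch only, not the platforms' implementation: the `Post` type, its field names, and the sample inventory are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Post:
    post_id: str          # hypothetical identifier
    created_at: datetime  # publication date and time of the original post


def chronological_feed(inventory: list[Post]) -> list[Post]:
    """Order a user's inventory so that the most recent post comes first."""
    # The same rule applies to every user at all times: no engagement
    # predictions and no learning from past behavior.
    return sorted(inventory, key=lambda post: post.created_at, reverse=True)


# Hypothetical inventory of three posts from a user's connections.
inventory = [
    Post("a", datetime(2020, 10, 1, 9, 0)),
    Post("b", datetime(2020, 10, 3, 18, 30)),
    Post("c", datetime(2020, 10, 2, 12, 15)),
]
feed = chronological_feed(inventory)
print([post.post_id for post in feed])
```

Because the ordering depends only on each post's timestamp, two users with the same inventory would see the same feed, which is what distinguishes this baseline from a personalized ranking system.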


Theory and research questions 


Algorithmic effects have rarely been the main 
focus of quantitative, publicly available research 
on social media and politics as opposed to other 
domains (5, 12). Out of necessity, studies have 
tended to treat social media as a unified “bundle” 
of features—which includes not only algorithms 
but also repeated interactions with certain indi- 
viduals and the ability to reshare content, for 
example (14-16)—or to manipulate exposure to
specific sources within the platform (17, 18). By
replacing the status quo algorithmic ranking sys- 
tem, our experiment allowed us the opportunity 
to focus on the role of Facebook’s and Instagram’s 
algorithms in sorting and ranking content from 
the “inventory” generated by connections in 
users’ social networks and the posts and inter- 
actions that they produce. 

This focus permitted us to identify the par- 
ticular role of algorithmic ranking. Other ele- 
ments or affordances of the “bundle” of social 
media features could conceivably be more im- 
portant. Similarly, although the particular mix 
and ordering of content that a user sees may 
be an important factor in shaping political 
attitudes and behaviors, other determinants 
of this ranking besides the particular algorithm 
used—such as user choice or the composition 
of one’s network—could be relevant as well (19). 
To the extent that prior research has docu- 
mented social media effects on our outcomes 
of interest (14, 15), a contribution of this study
is to determine whether these can be attributed 
to a key feature of modern social platforms: 
personalized algorithmic ranking systems. 

Because the impact of these systems is largely 
unknown, we began with a research question 
(RQ): How does the Chronological Feed affect 
the content people see? We formulated this ques- 
tion and subsequent hypotheses in terms of our 
randomly assigned treatment, in which partic- 
ipants’ feeds were ranked in reverse-chronological 
order instead of by the default algorithm. Aside 
from its policy relevance and simplicity, our
choice of the reverse-chronological baseline was 
intended to maximize the differences in users’ 
feed experiences and thus the likelihood that 
algorithmically driven changes in content and 
emphasis could shape politically relevant at- 
titudes and behaviors. Additionally, machine- 
learning feed algorithms are often thought of 
as a “black box” because the rules they follow are 
not transparent. They are also optimized for cer- 
tain outcomes such as user engagement, and they 
“learn” from user behavior and adapt over time. 
By contrast, a chronological feed does not ex- 
hibit any of these features because the same 
rule is applied to all users at all times (20). 

We focused on three primary hypotheses that 
represent important public concerns that have 
been theorized in existing research, tied to three 
groups of outcomes potentially affected by the 
altered mix of feed content. The first outcome 
that we studied is polarization (H1). Scholars 
have long been interested in the polarizing effects 
of social media (21). Social media feed algorithms
could affect political polarization in at least 
two ways. First, if algorithms use past user behav- 
ior to inform output, then perspectives with 
which a user has engaged in the past may be 
prioritized in the future, potentially encouraging 
selective exposure to like-minded political views 
(1, 19, 22-24). Repeated exposure to like-minded 
and reinforcing content may foster greater ex- 
tremity on issue positions (17). Thus, we ex- 
pected the Chronological Feed to lessen issue 
polarization (H1a). Second, algorithms may
deprioritize content from certain parts of a user’s 
network of connections. More specifically, prior 
work has argued that Facebook’s features may 
encourage partisan stereotyping (3, 25) and 
influence negative attitudes about outgroups 
(26). As a result, in the US partisan political con- 
text, we also expected the Chronological Feed 
to lessen affective polarization at the individual
level (H1b) (27).

The second primary hypothesis relates to 
political knowledge (H2). Feed algorithms could 
have consequences for political knowledge be- 
cause news in today’s polarized society is often 
engaging, making news more likely to be en- 
countered both purposefully and incidentally 
(28-30). Although people use social media for 
purposes unrelated to politics (31), one-third of
Facebook users consume news on the platform 
(32). We expected that moving users out of 
an Algorithmic Feed would reduce their time 
spent on the platform and their on-platform 
engagement, decreasing exposure to political 
information. Thus, we expected the Chrono- 
logical Feed treatment to decrease knowledge 
about the 2020 election campaign (H2a) and 
decrease recall of recent events covered in 
the news (H2b). 

Our third primary hypothesis relates to poli- 
tical participation (H3), which could be in- 
fluenced by feed algorithms in several ways. 

Fig. 1. Comparison of user experience and behavior for Chronological and Algorithmic Feed conditions on
Facebook and Instagram. (A1, B1, C1, and D1) Facebook. (A2, B2, C2, and D2) Instagram. Values are
unweighted sample statistics. All differences are significant at the p < 0.005 level, except share of exposures from
unconnected users on Instagram; confidence intervals are thus not shown.

First, online political engagement may decline
simply because overall online engagement 
declines in the Chronological Feed. Second, 
political knowledge may empower citizens to 
participate in the political process (33, 34), and 
as knowledge declines, so too might participa- 
tion. Third, social media algorithms may lower 
the coordination and information costs of mobilization
(11, 15, 35), so that removing such algorithms
would diminish participation. Another
possible mechanism is that chronological rank- 
ing might increase exposure to more cross- 
cutting perspectives relative to the Algorithmic 
Feed, which scholars have argued can gen- 
erate ambivalence and have a demobilizing 
effect (36). We hypothesized that the Chron- 
ological Feed would reduce both online and 
offline forms of political participation (H3a), 
including self-reported turnout in the 2020 
election (H3b) and on-platform political en- 
gagement (H3c). 

Beyond these three sets of predictions, we 
prespecified a series of secondary hypotheses 
pertaining to public concerns but for which we 
do not have clear theoretical expectations based 
on prior research [supplementary materials (SM) 
section 2.2]. The secondary hypotheses cover 
the effects of the Chronological Feed treatment 
on factual belief accuracy, online consumption 
of political news (using behavioral data), trust 
in traditional and social media, confidence in 
institutions, perceptions of polarization, engage- 
ment with partisan news sources, epistemic 
political efficacy, vote choice, belief in the in- 
tegrity of the election, and support for political 
violence. 


Design and results 


This study is the result of a collaboration be- 
tween Meta and academic researchers outside 
of Meta (details including funding indepen- 
dence and safeguards against selective report- 
ing of results are available in SM section S11). 
Because Meta did not financially compensate 
academics for their work, we acknowledge that 
the academic researchers selected as part of the 
research team were fortunate to benefit from the 
support of their universities and external fund- 
ing. Study participants were recruited through 
survey invitations placed on the top of their 
Facebook and Instagram feeds in August 2020. 
Participants were users residing in the United 
States who were at least 18 years of age and who 
provided informed consent. Users who consented 
to participate in the study on both Facebook and 
Instagram were more active than the average 
monthly active user (SM section S3.1). Users were 
invited to complete five surveys (beginning in 
late August, mid-September, mid-October, No- 
vember immediately after Election Day, and mid- 
December 2020), share their on-platform activity, 
and participate in passive tracking of off-platform 
internet activity. Participants were given the option
to withdraw from the study and/or withdraw
their data from the study up until the
data were disconnected from any identifiers. 
Additional details on participant recruit- 
ment and consent are available in SM section 
S9.3, and ethics are discussed in SM sections
S1.2 and S12.

Participants were randomly assigned to either 
a status quo Algorithmic Feed control condition 
in which no changes were made to their Face- 
book or Instagram feeds or to the Chronologi- 
cal Feed treatment condition, active from 
24 September to 23 December 2020, in which 
the most recent content (defined by the pub- 
lication date and time of original posts) appeared 
at the top of those feeds. Both groups received 
the same level of compensation for participating. 
Selection and placement of posts from con- 
nected accounts such as friends, Pages, and 
Groups were potentially affected; those of advertisements
were not. According to independent
estimates, this implies that about 80% of
the material presented to respondents on the 
platform was manipulated as part of the 
experiment (37). The large samples (Facebook: 
n = 23,391; Instagram: n = 21,373), comprising 
participants who completed the first two sur- 
veys and at least one of the subsequent three 
waves, allowed for adequate statistical power 
to detect small effects (for example, for affective 
polarization, we were powered to detect popu- 
lation average treatment effects with Cohen’s 
d = 0.032 or larger for both Facebook and 
Instagram). Among participants in the exper- 
imental sample, 19.5% did not complete any of 
our posttreatment survey waves. However, this 
attrition is not significantly different between 
treatment and control groups (Facebook, p = 
0.83; Instagram, p = 0.35). Additional informa- 
tion on these and other aspects of the study de- 
sign is available in SM section S1, materials and 
methods. The study design, measures, and anal- 
ysis were preregistered at Open Science Frame- 
work (OSF) before treatment assignment. 
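
For readers unfamiliar with the effect-size metric used in the power statement above, the conventional pooled-standard-deviation form of Cohen's d can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and toy data are hypothetical, and the exact standardization used in the study is documented in the SM rather than in this section.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treatment: list[float], control: list[float]) -> float:
    """Standardized mean difference using the pooled sample standard deviation."""
    n_t, n_c = len(treatment), len(control)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_t - 1) * variance(treatment) + (n_c - 1) * variance(control)
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


# Toy data (hypothetical): group means differ by 1 and the pooled SD is 2.
print(cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0]))
```

On this scale, the minimum detectable effect of d = 0.032 reported above corresponds to a difference in group means of roughly 3% of one standard deviation of the outcome.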

As discussed in the preanalysis plan, our main 
estimand of interest is the population average 
treatment effect (PATE), which is weighted by 
users’ predicted ideology, friend count, number 
of political pages followed, and number of days 
active, among other variables (SM section S9.5). 
We also report the unweighted sample average 
treatment effect (SATE) among consenting par- 
ticipants for transparency. The unweighted sam- 
ple has a greater proportion than that of the 
weighted sample of those who are between the 
ages of 30 and 44, white, female, and liberal; 
have higher income; and have a college degree 
(detailed comparisons are available in SM section
S3.1). Our PATE estimates were designed
to facilitate inferences to the Facebook and 
Instagram populations assuming negligible 
treatment-effect heterogeneity by other char- 
acteristics. In general, and consistent with 
(38), we found limited effect heterogeneity (SM 
section S2.3). In case our weighting scheme did
not fully account for the greater activity levels 
in the sample, our estimates correspond to ar- 
guably the most relevant subset of users: those 
who engage the most and generate a dispro- 
portionate share of content on the platform 
(SM section S2.1).
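
The distinction between the unweighted SATE and the weighted PATE can be illustrated with a simplified difference in means. This sketch makes assumptions that go beyond the text: the study's estimates come from fully specified regression models with survey weights (reported in the SM), whereas the functions, outcomes, and weights below are hypothetical and omit covariate adjustment.

```python
from typing import Optional, Sequence


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Mean of `values` in which each observation counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def difference_in_means(
    y_treat: Sequence[float],
    y_control: Sequence[float],
    w_treat: Optional[Sequence[float]] = None,
    w_control: Optional[Sequence[float]] = None,
) -> float:
    """Treatment-minus-control difference; equal weights give the sample estimate."""
    if w_treat is None:
        w_treat = [1.0] * len(y_treat)
    if w_control is None:
        w_control = [1.0] * len(y_control)
    return weighted_mean(y_treat, w_treat) - weighted_mean(y_control, w_control)


# Hypothetical outcomes and weights. The second treated participant stands in
# for a type of user who is underrepresented in the sample, so is upweighted.
y_treat, y_control = [1.0, 3.0], [2.0, 2.0]
sate = difference_in_means(y_treat, y_control)
pate = difference_in_means(y_treat, y_control, [1.0, 3.0], [1.0, 1.0])
print(sate, pate)
```

The two estimates diverge only when outcomes differ across the kinds of users that the weights rebalance, which is why limited effect heterogeneity implies similar SATE and PATE values.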

We first characterized the impact of the Chro- 
nological Feed treatment on users’ on-platform 
experience (Fig. 1). Engagement, user satisfac- 
tion, and news originality are key signals used 
by the Facebook feed algorithm to determine 
how to rank content for individual users [more 
publicly available information is available in 
(20, 39, 40, 41)] (SM section S1.1). We found that 
users in the Chronological Feed group spent 
dramatically less time on Facebook and Instagram.
Participants in this study spent more time
on Facebook and Instagram than did the average
US monthly active user of both platforms
(Fig. 1, A1 and A2) (comparisons of sample with
US monthly active users are provided in SM
section S3.1). However, the average respondent
in the Algorithmic Feed group spent 73% more 
time each day on average compared with US 
monthly active users, and for those in the Chro- 
nological Feed group, this value reduced to 37% 
more (p < 0.005). On Instagram, the average 
participants in the Algorithmic Feed group 
spent 107% more time, whereas those in the 
Chronological Feed group spent 84% more 
time, compared with US monthly active users 
(p < 0.005). We observed substitution to 
other social media platforms in the following 
ways: On mobile, the mean number of hours 
that Instagram users in the Chronological Feed 
group spent on TikTok and YouTube increased 
by 36% (2.19 hours) and 20% (5.63 hours) over 
the entire study period, respectively (p < 0.05); 
for Facebook users, time spent on Instagram in- 
creased 17% (1.24 hours) (p < 0.05). For browser 
usage, we detected no change among Instagram 
users, but the average number of visits by Face- 
book users to reddit.com and youtube.com in- 
creased in the Chronological Feed group by 
52% (16.2 visits, p < 0.005) and 21% (50.1 visits, 
p < 0.05), respectively.

On Facebook, users in the Algorithmic Feed 
group liked an average of 6.7% of the content 
to which they were exposed, whereas those in 
the Chronological Feed group liked 3.1% on 
average (Fig. 1, B1); the same pattern of lower 
engagement for Instagram is shown in Fig. 1, 
B2 (p < 0.005 for likes and comments on both 
platforms). The Chronological Feed decreased 
the relative share of content from friends by 
an average of 24 percentage points on Facebook 
and from mutual follows (Instagram relation- 
ships in which the account a user follows 
also follows the user’s account) by an average of 
5 percentage points on Instagram (p < 0.005;
Fig. 1, C1 and C2). The Chronological Feed decreased
the share of users’ networks of friends,
Pages, and Groups from which they saw content
on Facebook (Fig. 1, D1); the daily share of
content from users’ mutual follows also decreased 
on Instagram (p < 0.005 for all comparisons) 
(Fig. 1, D2). All descriptive statistics on time 
spent, engagement, exposure, network, and 
substitution are reported in SM section S2.1.
We next turned to our RQ and examined how 
the chronological ranking affected the mix of 
content in users’ feeds. These effects mostly 
did not vary by prespecified subgroups, with the 
exception of one difference that we note below 
(all subgroup analyses are provided in SM sec- 
tion S2.3). As shown in Fig. 2, Chronological 
Feeds contained more content pertaining to 
politics on average than did Algorithmic Feeds, a 
difference also reflected in political news content 
on Facebook. The Chronological Feed treatment 
reduced the share of content from ideologically 
“cross-cutting” sources on Facebook (18.7 versus 
20.7%, p < 0.005) and also reduced the share of 
content from ideologically “like-minded” sources 
on Facebook (48.1 versus 53.7%, p < 0.005); de- 
tails on the classification of political ideology, 
which is not available on Instagram, are avail- 
able in SM section S7. An exploratory analysis 
suggests that reductions in both like-minded 
and cross-cutting content on Facebook are offset 
by increases in content from moderate friends 
and sources with ideologically mixed audiences 
(30.9% in Chronological Feed versus 22.6% in 
Algorithmic Feed, p < 0.005). Both the increase 
in this middle category and the decrease in 
posts from like-minded sources in the Chrono- 
logical Feed are larger in magnitude than the 
decrease in cross-cutting posts, which is arguably 
consistent with the Algorithmic Feed promoting
an “echo chamber” or “filter bubble” effect (1).
The largest relative shifts were in the share 
of content from untrustworthy sources and con- 
tent classified as uncivil or containing slur words 
(more details on classification methods are 
provided in SM section S7). On Facebook, the 
Chronological Feed increased the share of con- 
tent from designated untrustworthy sources by 
more than two-thirds relative to the Algorithmic 
Feed (4.4 versus 2.6%, p < 0.005), whereas it reduced
exposure to uncivil content by almost
half (1.8 versus 3.2%, p < 0.005). Our analysis of 
effect heterogeneity (SM section S2.3) reveals 
that adjusted for variable importance, the re- 
duction of incivility was greater among users 
with larger inventories, which is a proxy for 
network size. On Instagram, we did not find a 
meaningful change in the proportion of users’ 
feeds with uncivil content (1.6 versus 1.6%), but 
we did find an increase in content from untrust- 
worthy sources like that on Facebook, although 
smaller (1.6 versus 1.3%, p < 0.005). The last 
content category we analyzed was content with 
slur words, which was extremely rare to begin 
with on both platforms according to our clas- 
sification methods (0.03% on Facebook, 0.02% 
on Instagram) and decreased even further (by 
approximately one-half) in the Chronological 
Feed on Facebook (0.019%, p < 0.005). 
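
The relative shifts quoted in this paragraph follow from the reported shares by simple arithmetic, sketched below. The helper name is hypothetical, and because the inputs are the rounded percentages printed in the text, the results are approximate.

```python
def relative_change(treatment_share: float, control_share: float) -> float:
    """Percent change of the treatment share relative to the control share."""
    return (treatment_share - control_share) / control_share * 100


# Facebook shares from the text: Chronological Feed versus Algorithmic Feed (%).
untrustworthy = relative_change(4.4, 2.6)  # "more than two-thirds" higher
uncivil = relative_change(1.8, 3.2)        # "almost half" lower
print(f"untrustworthy: {untrustworthy:+.1f}%  uncivil: {uncivil:+.1f}%")
```

Relative changes can look large when the baseline share is small: the increase in untrustworthy content on Facebook is about 69% in relative terms but only 1.8 percentage points of the feed in absolute terms.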

We then turned to tests of our primary hypotheses
(Fig. 3, top). Across both Facebook
and Instagram, respondents in the Chronological
Feed condition did not express significantly
lower levels of affective or issue polarization
than did those in the Algorithmic Feed condition
(p = 1 in all cases, adjusted for multiple comparisons),
meaning that we did not find support
for H1. Additionally, there were no statistically 
significant or substantive differences between 
the treatment conditions with respect to elec- 
tion knowledge or news knowledge on either 
platform (p > 0.63 for all estimates), meaning 
that we did not find support for H2. In all cases, 
we could rule out effect sizes smaller than those 
found in previous research (14).

Our third hypothesis concerns self-reported 
political participation. Our survey-based mea- 
sure of participation encompassed forms that 
occur online (such as signing an online petition), 
offline (such as attending a protest or rally), or 
both (such as contributing money to a political 
candidate). Estimates for the Facebook plat- 
form suggest an effect close to zero of the Chro- 
nological Feed on this index of self-reported 


political participation (p = 1.0). We also did not 
find a statistically significant difference in the 
proportion of users who reported voting in the 
2020 election between Algorithmic and Chrono- 
logical Feed groups on Facebook (p = 1.0). On 
Instagram, there was also no detectable effect on 
either self-reported political participation (p = 
1.0) or self-reported turnout (p = 0.64). How- 
ever, using on-platform measures of political 
engagement—for example, posting or liking con- 
tent classified as political, sharing that you voted, 
and mentioning politicians and candidates run- 
ning for office on social media—we found strong 
evidence that the Chronological Feed had a 
negative impact on both platforms (Facebook, 
-0.117 SD, p < 0.005; Instagram, -0.090 SD,
p < 0.005). 

We briefly revisit the possible mechanisms 
for how feed algorithms could influence this 
online form of political participation. Our 
proposed mechanisms pertaining to knowledge
and cross-cutting perspectives could not directly 
be evaluated with our design because these po- 
tential mediators occurred post-treatment (42). 
We also could not specifically test our proposed

[Fig. 2 plot not reproduced. Legend: Algorithmic Feed, Chronological Feed. Recoverable row labels: proportion of feed that is political content, 0.155 (+15.2%); proportion of feed from moderate/mixed sources.]

Fig. 2. Estimated changes in prevalence of feed content on both Facebook and Instagram. (Left)
Facebook. (Right) Instagram. Values are average unweighted proportions within each group, with percent changes
relative to the Algorithmic Feed control group in parentheses. All differences are significant at the p < 0.005
level, except RQ1f for Instagram (p < 0.05); confidence intervals are thus not shown. RQ1b and RQ1c were not
tested for Instagram because political and ideology classifications are not available on that platform. Fully
specified regression models with survey weights are reported in the SM, section S2.2.

Fig. 3. Population average treatment effects of the Chronological Feed,
relative to the Algorithmic Feed control group, on both Facebook
and Instagram. (Left) Facebook. (Right) Instagram. Estimates
are presented in standard deviations with 95% confidence intervals (not
adjusted for multiple comparisons). Partisan news clicks are estimated
only for Facebook because source-level estimates of political ideology are not
available for Instagram. pol., political; pres., presidential.

mechanism regarding coordination and information
costs of mobilization, which we leave to
future research. The remaining mechanism 
that we proposed was that on-platform parti- 
cipation would decline as a result of an overall 
drop in online engagement. In SM section S3.2 
(tables S82 to S85), we present evidence that 
decreases in engagement with political con- 
tent on both platforms are of similar magni- 
tude to decreases in engagement with all content. 
This is consistent with our analysis of hetero- 
geneous effects, which shows that the decrease in 
on-platform political engagement is largest among 
the most active Facebook and Instagram users 
(SM section S2.3 and figs. S34 and S49). 

Last, we tested our secondary hypotheses, 
which are reported in Fig. 3, bottom. Across 
nearly all outcomes, we observed no signifi- 
cant differences between the two feed conditions. 
The treatment does not have statistically dis- 
tinguishable effects on perceived accuracy of 
various factual claims, trust in media (either tra- 
ditional or social), confidence in political insti- 
tutions, perceptions of political polarization, 
epistemic political efficacy, belief in the legiti- 
macy of the 2020 election, or political violence. 
The one exception is that on the Facebook 
platform, users in the Chronological Feed con- 
dition clicked more frequently on political news 
content from likely partisan sources (0.107 SD, 
p < 0.01). In an analysis that was not preregistered,
we found that this was driven by an
increase in on-platform exposure to news from
the same set of partisan sources (SM section
S3.3). Although this may seem inconsistent
with our finding that the treatment increased 
exposure to ideologically moderate or mixed 
sources, the apparent discrepancy lies in that 
these latter exposure measures were based on 
all types of content, whereas our preregistered 
news clicks measure focused on posts contain- 
ing political news links only. Given that past 
research has found that news sources whose 
audiences are politically homogeneous have 
lower reliability (43), one potential explana- 
tion for the difference is that ongoing efforts to 
down-rank low-quality sources in the algorith- 
mic feed may have reduced exposure to the 
kind of frequently posted partisan news links 
that would receive more visibility in a chrono- 
logical feed. 


Discussion 


We demonstrated that Facebook’s and Insta- 
gram’s feed-ranking algorithms in late 2020 
strongly influenced users’ experiences on social 
media. The Chronological Feed dramatically 
reduced the amount of time users spent on the 
platform, reduced how much users engaged 
with content when they were on the platform, 
and altered the mix of content they were 
served—for example, decreasing content from 
friends while increasing content from Pages 
and Groups on Facebook. Users saw more con- 
tent from ideologically moderate friends and 
sources with mixed audiences; more political 


content; more content from untrustworthy sour- 
ces; and less content classified as uncivil or con- 
taining slur words than they would have on the 
Algorithmic Feed. Users also participated less 
in online forms of political engagement. Beyond 
these platform-specific experiences, however, 
replacing existing machine-learning algorithms 
with reverse-chronological ordering of content 
did not cause detectable changes in downstream 
political attitudes, knowledge, or offline behavior, 
including survey-based measures of polariza- 
tion and political participation. These findings 
shed light on prior research as well as folk the- 
ories about the effects of social media algo- 
rithms on politics and elections (3). 

There are several possible explanations for 
the disconnect between the large changes in 
online behavior caused by our treatment and 
the few discernible changes in political atti- 
tudes, knowledge, and offline behaviors in our 
user sample. It is possible that such down- 
stream effects require a more sustained inter- 
vention period (44), although our approximately 
3-month study had a much longer duration than 
that of most experimental research in political 
communication. Our results may also have been 
different if this study were not run during a 
polarized election campaign when political con- 
versations were occurring at relatively higher 
frequencies, or if a different content-ranking 
system were used as an alternative to the sta- 
tus quo feed-ranking algorithms. Along these 
same lines, this study was run in a specific 
political context (the United States), and the
results may not generalize to other political 
systems. That being said, many of the features 
of the contemporary United States—such as 
increased polarization, the rise of populism, 
and the presence of online misinformation—are 
present in many other democracies. It is possible 
that the effects of algorithms could be more 
pronounced in settings with fewer institutional- 
ized protections (for example, a less-independent 
media or a weaker regulatory environment). 
Last, the change to the Chronological Feed af- 
fected many aspects of users’ experiences on 
Facebook, Instagram, and beyond—for example, 
decreasing time spent on the platforms, seeing 
more content from Groups and Pages rather 
than friends, seeing more and less of different 
types of content, and increasing time spent on 
other social media platforms. These factors may 
in turn have affected each other and have had 
differing effects on political attitudes, knowl- 
edge, and behaviors, so that in aggregate we 
did not observe discernible changes. 

An additional set of limitations concerns the 
nature of our design: Our research design pro- 
vides estimates of direct effects on individuals 
and as such cannot speak to whether social 
media shapes societal incentives or influences 
the behavior of other users. For example, if 
ranking algorithms affect the demand for cer- 
tain kinds of content, they could influence the 
decisions of content producers (such as news 
organizations), civic organizations, and politi- 
cal campaigns, which would feed back into the 
inputs of the algorithmic system itself. Even 
more simply, the Chronological Feed experiences 
that we studied were composed in part of posts 
shared by other users not in our sample whose 
behavior was shaped by the standard algo- 
rithmic rankings. Our design thus cannot speak 
to “general equilibrium” effects. If our random- 
ized intervention (Chronological Feed) were to 
be scaled up to the population of all users, its 
impact may differ because algorithmic feed- 
back within networks may generate complex- 
system dynamics. Future research can examine 
whether some of the strongest effects uncov- 
ered in this study—such as the decreased time 
spent on platform—mediate the relationship 
between the Chronological Feed and political 
attitudes and behaviors. 

Despite these limitations, the size of the 
experimental data we gathered is large. For the 
primary outcomes, our findings rule out even 
modest effects, tempering expectations that 
social media feed-ranking algorithms directly 
cause affective or issue polarization in individuals
or otherwise affect knowledge about political
campaigns or offline political participation.
Although the dependent variables we
studied are important for democratic health, 
they are not exhaustive. Future research should 
explore whether the kinds of ranking algo- 
rithms that we studied can produce substantial 
effects on other outcomes, such as agenda
setting or users’ own curation habits (9). 

To the extent that these findings suggest 
that social media algorithms may not be the 
root cause of phenomena such as increasing 
political polarization (44), they raise the stakes
for finding out what other online factors—such 
as the incentives created by the advertising 
model of social media—or offline factors—such 
as long-term demographic changes, partisan 
media, rising inequality, or geographic sorting— 
may be driving changes that affect democratic 
processes and outcomes. 


Reshares on social media amplify political news
but do not detectably affect beliefs or opinions


Andrew M. Guess*, Neil Malhotra, Jennifer Pan, Pablo Barberá, Hunt Allcott, Taylor Brown,
Adriana Crespo-Tenorio, Drew Dimmery, Deen Freelon, Matthew Gentzkow, Sandra González-Bailón,
Edward Kennedy, Young Mie Kim, David Lazer, Devra Moehler, Brendan Nyhan,
Carlos Velasco Rivera, Jaime Settle, Daniel Robert Thomas, Emily Thorson, Rebekah Tromble,
Arjun Wilkins, Magdalena Wojcieszak, Beixian Xiong, Chad Kiewiet de Jonge, Annie Franco,
Winter Mason, Natalie Jomini Stroud, Joshua A. Tucker


We studied the effects of exposure to reshared content on Facebook during the 2020 US election by assigning a
random set of consenting, US-based users to feeds that did not contain any reshares over a 3-month period.
We find that removing reshared content substantially decreases the amount of political news, including content from
untrustworthy sources, to which users are exposed; decreases overall clicks and reactions; and reduces partisan news
clicks. Further, we observe that removing reshared content produces clear decreases in news knowledge within
the sample, although there is some uncertainty about how this would generalize to all users. Contrary to expectations,
the treatment does not significantly affect political polarization or any measure of individual-level political attitudes.


It has been more than 10 years since social
media platforms began to adopt reposting
as a core feature. More than one-fourth of
posts that Facebook users see in their feeds
have been “reshared” [see supplementary
materials (SM) section S1.8]: Rather than orig- 
inating from a friend, a followed page, or a group 
they belong to, these posts appear in their feeds 
when they are reposted by someone to whom 
they are directly connected. For Facebook and 
other platforms that enable reshare function- 
ality, content rapidly attains popularity, or “goes 
viral,” because it is reshared (1-5), and removing 
reshared content has been suggested as a policy 
intervention to minimize harmful effects of vi- 
rality (6-8). However, the effects of such an in- 
tervention are unknown because most reshared 
content does not go viral. Given platforms’ im- 
portance as facilitators of public discourse, 
this key element of the social media experience 
could affect political attitudes and behaviors. 
Accordingly, we investigate the impact of re- 
moving reshared content from the feeds of 
consenting Facebook users on political polar- 
ization and political knowledge with a random- 
ized controlled experiment conducted within 
the context of the 2020 US presidential elec- 
tion campaign. 


Theory and research questions 


Prior research suggests that reshared content
may increase political polarization and political
knowledge. For example, experimental research
has shown that deactivating Facebook causes a
decrease in both polarization and knowledge
(9), suggesting that these outcomes may be
related. Resharing is a key feature of social plat-
forms that could plausibly drive these effects. In
particular, reshared posts with strong social
endorsements seem more credible than non-
reshared posts and may thus be more notice-
able to users (10, 11). Users tend to reshare others’
content when it is emotionally activating (12),
including politically tinged content that could
feed polarized attitudes (13-16).

Withholding reshared content from users’ feeds 
may thus reduce affective and issue polarization 
by decreasing their exposure to emotionally or 
ideologically inflammatory content (H1). The 
hypothesized mechanism is that reshares are 
likely to promote encounters with partisan cross- 
cutting sources, increasing the potential for 
engagement with polarizing content. Some ex- 
perimental research suggests that exposure to 
such content could increase issue polarization (17), 
though evidence is still mixed (14). By contrast,
reshared content may have a net benefit for political
knowledge. Though reshares can increase
exposure to misinformation and lower-quality
content (e.g., clickbait, sharebait) (4, 5), they may
also incidentally expose users to trustworthy in-
formation they do not seek out, facilitating by-
product learning about the election and events
in the news (18-22). Given that misinformation
is a small fraction of content on social platforms
(23, 24), we expect that removing reshared con-
tent would, on net, reduce accurate knowledge
about the election campaign (H2). We additionally
test a series of secondary hypotheses about the ef-
fect of withholding reshared content on a range
of outcomes including factual discernment and
belief in the integrity of the 2020 election.


Design and results 


As part of the US 2020 Facebook and Instagram 
Election Study (FIES), participants were recruited 
through survey invitations placed on the top of 
their Facebook feeds beginning in August 2020, 
which were seen by ~14.6 million users. Adult 
users residing in the United States who provided 
informed consent (N = 193,880, 1.3% of those 
who saw the invitation) were invited to complete 
five surveys and share their Facebook activity (see 
SM section S11 for study timeline and CONSORT 
flow diagram). Participants were given the op- 
tion to withdraw from the study and/or withdraw 
their data from the study up until the data were 
disconnected from any identifiers in February 
2023 (see SM section S8.2 for additional details 
on recruitment and consent, and sections S1.2 
and S12 for ethical considerations). Participants 
who completed baseline surveys (N = 75,189,
0.5% of those who saw the invitation) were ran- 
domized into eight experimental conditions 
associated with the larger FIES study. These 
participants were separately invited to share
data on their desktop and mobile web visits 
(N = 7730 consented and shared data, or 0.05% 
of those who saw the initial study invitation). 

In this article, we analyze data from two of the 
experimental conditions: (i) a control condition in 
which no changes were made to their Facebook 
feeds, and (ii) a treatment condition in which no 
reshared content (from friends, Groups, or Pages) 
was shown in the feed (for simplicity, we refer 
to this as the No Reshares group). Both groups
were compensated equally for participation.
The experimental intervention was active from 
24 September to 23 December 2020. Our samples 
(N = 23,402 for survey data and N = 3781 for
passive tracking data), comprising participants 
who completed the first two surveys and at least 
one of the subsequent three waves, allow for ade- 
quate statistical power to detect small effects (e.g., 
for affective polarization, we are powered to de- 
tect population average treatment effects with 
Cohen’s d = 0.032 or larger and sample average 
treatment effects d = 0.023 or larger). Although 
technical constraints meant that respondents in
the No Reshares group saw a small amount of 
reshared content (see S18 for more details), re- 
spondents in the No Reshares group saw sub- 
stantially less reshared content than those in the 
control group: 28% of views in the control group 
were of reshared content, compared with 5.8% 
in the No Reshares treatment group (p < 0.01). 
Among participants in the experimental sam- 
ple, 19.5% did not complete any of our posttreat- 
ment survey waves. However, this attrition is not 
significantly different between treatment and 
control groups (p = 0.75). We also did not observe 
any differential attrition by survey wave across the 
study period (see S1.9). For additional information 
on these and other aspects of the study design, see 
SM section S1, materials and methods. The study 
design, measures, and analysis were preregistered 
at the Open Science Framework (OSF) prior to 
treatment assignment. As discussed in the preanal- 
ysis plan, our main estimand of interest is the pop- 
ulation average treatment effect (PATE), which is 
weighted by users’ predicted ideology, friend 
count, number of political pages followed, and
number of days active, among other variables (see 
SM section S8.5). We also report the unweighted 
sample average treatment effect (SATE) among 
consenting participants for transparency. The 
unweighted sample is more active on the plat- 
form, more likely to be between the ages of 30 and 
44, more white, more female, more liberal, higher 
income, and more likely to have a college degree 
than the weighted sample (see SM section S3.3
for detailed comparisons). Our PATE estimates 
are designed to facilitate inferences to the popu- 
lation assuming negligible treatment effect het- 
erogeneity by other characteristics. In general, and 
consistent with (25), we find limited effect heter-
ogeneity (see S2.3). In case our weighting scheme
does not fully account for the greater activity lev- 
els in the sample, our estimates correspond to 
arguably the most relevant subset of users—those 
who engage the most and generate a dispropor-
tionate share of content on the platform (see S2.1).
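
The distinction between the two estimands can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: the study's estimates come from covariate-adjusted regression models, whereas this sketch uses a plain difference in (weighted) means. The `Participant` class, the function names, and every number in the toy data are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    treated: bool    # True if assigned to the No Reshares group
    outcome: float   # a standardized outcome (hypothetical values)
    weight: float    # survey weight toward the user population (hypothetical)


def weighted_mean(values, weights):
    """Return sum(w * v) / sum(w)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def average_treatment_effect(participants, use_weights):
    """Treated-minus-control difference in (optionally weighted) mean outcomes."""
    def group_mean(flag):
        group = [p for p in participants if p.treated == flag]
        weights = [p.weight if use_weights else 1.0 for p in group]
        return weighted_mean([p.outcome for p in group], weights)

    return group_mean(True) - group_mean(False)


# Toy data, invented purely for illustration.
sample = [
    Participant(True, 0.10, 0.5),
    Participant(True, 0.30, 1.5),
    Participant(False, 0.20, 1.5),
    Participant(False, 0.40, 0.5),
]

sate = average_treatment_effect(sample, use_weights=False)  # unweighted
pate = average_treatment_effect(sample, use_weights=True)   # weighted
print(f"SATE = {sate:+.3f}, PATE = {pate:+.3f}")
```

In this toy example the unweighted and weighted estimates differ only because the weights shift emphasis toward participants whose outcomes differ, which is the situation in which effect heterogeneity would make the two estimands diverge.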

Removing reshared content substantially al-
tered how participants used Facebook and the
content they saw (Fig. 1). Participants in this
study spent more time on Facebook than the
average US monthly active user (see SM sec-
tion S3.3 for details): The control group
spent 73% more time each day on average com-
pared to US monthly active users, whereas those
in the No Reshares group spent 64% more
(p < 0.005, Fig. 1A; all reported
p-values are from covariate-adjusted ordinary
least squares regression models with HC2 robust
standard errors). We do not observe significant
substitution to other social media platforms (see
SM section S2.1). Users clicked on 7.5% of content
they saw in the No Reshares group as opposed to
8.3% in the control group (p = 0.04, Fig. 1B). The
proportion of posts seen by respondents to which
they added any reaction (love, sadness, anger,
surprise, or caring) also decreased (p < 0.01), but
there were no significant changes in the rate of
likes, comments, or users’ own reshares (see
Table S25 in SM section S2.1 for more details).
This means that removing reshared content
does not change users’ own reshare behavior,
which aligns with our understanding that most
reshared content does not go viral: Resharing
content typically does not create subsequent
chains and cascades of resharing activity that
result in virality. The No Reshares treatment
decreased the relative proportion of content
seen by participants that is posted by their
friends by an average of 10 percentage points
while increasing the relative share of content
from Groups by 8 percentage points and from
Pages by 2 percentage points (p < 0.01 for all
comparisons, Fig. 1C). The No Reshares treatment
decreased the proportion of users’ networks of
friends and Pages from which they saw content,
but the share from users’ network of Groups in-
creased (p < 0.01 for all comparisons, Fig. 1D).

[Fig. 1 graphic not reproduced. Recoverable panel labels: (B) Click, Like, React, Comment, and Reshare rates, with reshares versus no reshares; (C) source of exposures; (D) average daily proportion of users' network with content viewed by user.]

Fig. 1. Comparison of user experience and behavior for the No Reshares treatment and control conditions. (A to D) Values are unweighted sample statistics
(N = 23,402). All differences are significant at the p < 0.005 level except for panel B Click rate (p < 0.05), Like rate (p > 0.05), Comment rate (p > 0.05),
and Reshare rate (p > 0.05); confidence intervals are thus not shown.

Users in the No Reshares treatment group 
saw a different mix of content in their feeds, with 
the largest changes occurring in political news, 
untrustworthy sources, and content classified 
as uncivil, as well as political content more gen- 
erally (see S7 for more details on classification 
methods). As shown in Fig. 2, feeds without 
reshares contained less content pertaining to po- 
litics on average compared to the control group 
(10.8 versus 13.5%, p < 0.005), a difference that 
was especially pronounced for political news 
content, which decreased by more than half, on 
average (2.5 versus 6.2%, p < 0.005). Suppressing 
reshares cut the share of content from untrust- 
worthy sources by nearly a third relative to the 
control group (1.8 versus 2.6%, p < 0.005), where- 
as it increased exposure to posts classified as 
uncivil by more than 6% (3.4 versus 3.2%, p < 
0.005). Overall, however, the levels of exposure 
to both content from untrustworthy sources and 
uncivil posts were relatively low at baseline. We 
also classified content from other users, Pages, 
and Groups in participants’ feeds as ideologically
“like-minded” or “cross-cutting” (see SM section 
S1, materials and methods). The No Reshares 
condition decreased the proportion of both 
like-minded (51.1 versus 53.7%, p < 0.005) and 
cross-cutting (19.7 versus 20.7%, p < 0.005) 
content, while increasing that of ideologically 
moderate content by more than 15% (26.2 versus 
22.6%, p < 0.005). Our analysis of treatment 
effect heterogeneity reveals that the decreases 
in political and political news content were 
greatest among those with the highest pre- 
treatment levels of Facebook activity, but effects 
mostly did not vary by other prespecified sub-
groups (see S2.3 for all subgroup analyses). Fi-
nally, exposure to speech with slur words, which
was extremely rare at baseline, was not signif- 
icantly affected by our treatment. 
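
The relative changes described above ("more than half," "nearly a third," "more than 6%," "more than 15%") can be checked against the quoted percentages with a few lines of Python. This is only an arithmetic sketch: the inputs are the rounded values printed in the text, so the results may differ slightly from figures computed on unrounded data, and the function and dictionary names are illustrative assumptions.

```python
def relative_change(treatment_pct, control_pct):
    """Percent change of the treatment value relative to the control value."""
    return (treatment_pct - control_pct) / control_pct * 100.0


# (No Reshares %, control %) pairs as quoted in the text; rounded values.
reported = {
    "political content": (10.8, 13.5),
    "political news": (2.5, 6.2),
    "untrustworthy sources": (1.8, 2.6),
    "uncivil content": (3.4, 3.2),
    "moderate content": (26.2, 22.6),
}

changes = {name: relative_change(t, c) for name, (t, c) in reported.items()}

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

Note that a change of 0.2 percentage points in uncivil content corresponds to a relative increase of about 6% only because the baseline share is small.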

Turning to tests of our primary hypotheses 
(Table 1, top section), we find that respondents 
without reshared content in their feeds did not 
express significantly lower levels of affective or 
issue polarization than those in the control group 
[population average treatment effect: false discov- 
ery rate (FDR) adjusted p > 0.8 in both cases], 
meaning that we do not find support for H1. 
There was also no statistically distinguishable
change in election knowledge (p > 0.8).
However, focusing on the sample average
treatment effect suggests that removing reshares 
significantly reduced news knowledge, i.e., users 
were less likely to correctly remember recent events 
(p < 0.01). When estimating the population 
average treatment effect to generalize beyond 
the consenting participant sample to the pop- 
ulation of Facebook users, the estimate for news 
knowledge falls short of our preregistered thresh-
old for statistical significance when applying our
correction for multiple testing (p = 0.16). Accord- 
ing to our primary estimand and prespecified 
threshold, H2 is also not supported. Yet, the 
effect sizes are fairly similar between the sam- 
ple average treatment effect and the population 
average treatment effect. Hence, removing re- 
shared content produces clear decreases in news 
knowledge within the sample, although there is 
some uncertainty about how this would gen- 
eralize to all users (see S3.3 for comparisons 
between the weighted and unweighted samples). 

In terms of secondary hypotheses (Table 1, 
bottom section), we observe no differences be- 
tween the treatment and control conditions that 
survive our preregistered correction for multi- 
ple comparisons (26) across nearly all outcomes 
(see SM section S1.11). The treatment does not 
have statistically distinguishable effects on per- 
ceived accuracy of various factual claims, trust 
in media (either traditional or social), confidence 
in political institutions, perceptions of political 
polarization, political efficacy, belief in the legi- 
timacy of the election, or support for political 
violence. We also do not detect differences in 
the consumption of political news off platform 
as measured with behavioral web visit data; 
although this may be surprising given the shifts 
in content that we observe, it is consistent with 
the low overall proportion of political news in 
people’s feeds. The one notable exception is that 
users in the No Reshares condition clicked less 
frequently on political news content from likely 
partisan sources (population average treatment 
effect: -0.109 SD, p < 0.01). 
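
To illustrate how a correction for multiple comparisons can move an individually significant result above the 0.05 threshold, here is a minimal Python sketch. It implements the standard Benjamini-Hochberg step-up adjustment, which is an assumption made for simplicity: it is not the sharpened FDR procedure (26) used in the study. The p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


raw = [0.004, 0.03, 0.04, 0.20, 0.60]  # hypothetical unadjusted p-values
adjusted = benjamini_hochberg(raw)

for p, q in zip(raw, adjusted):
    print(f"raw p = {p:.3f} -> adjusted p = {q:.3f}")
```

In this toy family of five tests, the second and third p-values are below 0.05 before adjustment but above it afterward, while the smallest remains significant.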

For both the primary and secondary hypothe-
ses, there are two results for the population
average treatment effect whose unadjusted
p-values individually fall below the 0.05
preregistered threshold for statistical significance
but cross above it when multiple comparisons
adjustments are implemented. In addition to having lower
news knowledge on average, participants in the
No Reshares group were less likely to be able
to discern false claims. These two outcomes
are related and substantively important, which
prompted us to conduct non-preregistered, ex-
ploratory analyses to probe potential mecha-
nisms that may explain these results (see S3).
The evidence that we explore suggests that sev-
eral mechanisms are unlikely to play a role. Spe-
cifically, we do not find that the No Reshares
treatment increased exposure to Groups in which
users frequently share untrustworthy informa-
tion or decreased exposure to Facebook products
aimed at increasing general knowledge (e.g.,
Voter Hub views); and, as noted previously, it did
not increase exposure to like-minded sources.
We do see a 62% decrease in exposure to main-
stream news sources in the No Reshares group,
which may have contributed to these results.
Given the exploratory nature of these analyses,
caution is warranted in interpreting these ad-
ditional results and possible explanations.

Fig. 2. Estimated changes in prevalence of feed content on Facebook. Values are average unweighted proportions within each group, with percent changes
relative to the control group in parentheses (N = 23,402). All differences are significant at the p < 0.005 level, except RQ1f (p > 0.05); confidence intervals
are thus not shown. Fully specified regression models with survey weights are reported in SM section S2.2.

Table 1. Population average treatment effects (PATE) and sample average treatment effects (SATE) of the No Reshares group (relative to the con-
trol group) for primary hypotheses (top) and secondary hypotheses (bottom). Estimates are presented in standard deviations (SD), and p-values are
presented with and without sharpened FDR adjustment (26). Outcomes marked with * were computed using passive tracking data subsample (N = 3,781; see
SM section S9). Outcomes marked with † were computed using Facebook platform data (N = 23,402; see SM section S6). All other outcomes were computed
using survey data (N = 23,402; see SM section S1.5).

[Table 1 body not recoverable. Surviving fragments: row labels "H1a: Affective polarization" and "SH3c: Partisan news visits*"; estimates "PATE -0.109" and "PATE -0.049"; interval bounds "-0.022, 0.014" and "-0.029, 0.024".]


Discussion 


Content that goes viral on Facebook does so pri- 
marily through the mechanism of reshares. But 
the kinds of cascades typically associated with 
viral memes and emotionally engaging stories 
are rare (27). More often, reshares simply promote 
greater exposure to content from Facebook 
friends (relative to Groups) in a user’s network 
and increase overall engagement with posts (in 
the form of clicks and reactions). By removing 
reshared posts from users’ feeds, our experi-
ment reveals the average impact of reshares on
the mix of content that they see. The two con-
tent types we measured that changed the most,
political news and posts from untrustworthy
sources, suggest that the reshares feature could
be a double-edged sword: It facilitates encoun- 
ters with both reliable news about politics and 
current events but, to a somewhat lesser degree, 
also content from untrustworthy sources that 
may exaggerate or fabricate information. 
However, despite these changes, we were not 
able to reliably detect shifts in users’ political 
attitudes or behaviors, with the exception of a 
decrease in news knowledge within the sample. 
Owing to the large size of our experiment, we can 
rule out even moderately sized effects. This dis- 
connect is illustrated by the one outcome among 
our secondary hypotheses on which we estimate 
a statistically distinguishable effect of the No 
Reshares treatment: a decrease in partisan news
clicks. Our finding indicates that without reshares 
in their feeds, users were less likely to click on 
outbound links from news sources with highly 
ideological audiences. Yet this clear example of 
reshares-driven on-platform behavior does not 
manifest in our survey measures of issue or af- 
fective polarization. We conclude that though re- 
shares may have been a powerful mechanism 
for directing users’ attention and behavior on 
Facebook during the 2020 election campaign, 
they had limited impact on politically relevant 
attitudes and offline behaviors. 

Although our study provides rigorous evidence 
of the effects of reshared content on political out- 
comes, our conclusions are limited to the period 
in which we conducted our experiment—relatively 
late in terms of user adoption and in the midst
of a politically divisive period in American history.
Finally, our design cannot speak to “general equi- 
librium” effects, because doing so would imply 
making inferences about societal impact—for 
example, how the demand for certain kinds of 
content, and consequently, the incentives of con- 
tent producers might change—were our random- 
ized intervention scaled up to the population of 
all users. Nonetheless, our findings lay the 
groundwork for future research to investigate 
these nuances. 

EDITORIAL


Face value 


When TheFacebook.com went live in Febru-
ary of 2004, social media quickly became an
integral part of life. Old friends rediscovered 
each other, pet photography flourished, and 
the way human attention on the internet was 
distributed was altered forever. As Ben Smith 
describes in his recent book, Traffic, before 
Facebook (now Meta) became widely adopted, aggrega- 
tors like The Huffington Post and the Drudge Report 
were the main drivers of traffic to websites. Now they 
have been supplanted by social media platforms. This has 
transformed the internet business, politics, and journal- 
ism. Along the way, Facebook has accumulated a treasure 
trove of social science data on human reading habits and 
engagement. In this issue, Science is 
publishing three research papers and 
a Policy Forum that arose from an in- 
novative partnership between Meta 
and academic researchers and illu- 
minate our understanding of Ameri- 
can political behavior. 

Facebook and its companion web-
site, Instagram, use a machine-intel-
ligent algorithm to feed content to their
users rather than providing a simple, 
chronological feed of posts from 
friends. The algorithm increases 
time spent on the site, engagement 
with posts, and consequently, reve- 
nue from advertisers. As described in the Policy Forum by 
Michael Wagner (a professor of journalism and mass com- 
munication at the University of Wisconsin-Madison), the 
collaboration between Meta and academic researchers 
was designed to answer a number of questions, includ- 
ing whether Facebook’s algorithm and related practices 
altered political attitudes during the 2020 US presiden- 
tial election. Although Meta clearly had an incentive to 
study these matters after receiving criticism for what was 
perceived as its outsized role in the 2016 election, Wagner 
describes how guardrails were put in place to ensure that 
it was the academic researchers who made final decisions 
about most matters. However, only Meta researchers had 
access to the raw data, academic researchers had access 
to only project-related code, and all inquiries were passed 
to Meta for consideration. 

Wagner, who was the project’s independent rappor- 
teur, was tasked with monitoring the effectiveness of 
the guidelines. He told Science’s podcast host Sarah 
Crespi, “The guardrail that gives academics control 
rights over the research designs and the interpretation 
of results is one that really builds...confidence in what
it is that happened here. But it’s the case that Meta set
the agenda in a variety of ways that affected how inde- 
pendent the researchers were.” 

Although the nature of the collaboration poses substan- 
tial challenges to publishing the results, Science decided 
that its potential to answer important questions makes 
publication worthwhile. Anonymized data and analysis 
code from the studies will be archived in the Social Media 
Archive at the Inter-university Consortium for Political 
and Social Research (part of the University of Michigan 
Institute for Social Research) and will be available for 
research that has been approved by an institutional re- 
view board on elections or to validate the findings of the 
studies. Last year, Science published a paper from a simi- 
lar collaboration between LinkedIn 
(the business and employment- 
focused social media platform) and 
academic researchers that addressed 
the power of weak social ties. Be- 
cause the findings were favorable 
for LinkedIn, some doubted whether 
the study was objective, but Science 
was satisfied with the independence 
of the finding and availability of the 
study’s data. Similarly, we have de- 
cided that the guidelines established 
by the researchers of the Facebook 
papers were adequate. “The scholar- 
ship is of high quality, it’s transpar- 
ent, and people are going to be able to replicate analyses,” 
Wagner said. “You can be as confident as possible about... 
the claims that these papers make.” Still, Wagner does not 
believe that the partnership is a model for future collabo- 
rations, because the academic researchers did not handle 
the raw data, nor did they set the workflow agenda. 

As for the findings, the studies conclude that changing 
to a chronological feed or eliminating reshared posts de- 
creased the amount of time people spent on Facebook’s 
website and affected the amount of misinformation 
that was transmitted through the platform. But perhaps 
more surprisingly, neither of these effects produced a 
discernible difference in the political attitudes of the us- 
ers. The explanation may lie in the data showing that 
the news fed to liberals by the engagement algorithms 
was very different from that given to conservatives, 
which was more politically homogeneous. Thus—as this 
issue’s cover illustration suggests—Facebook may have 
already done such an effective job of getting users ad- 
dicted to feeds that satisfy their desires that they are 
already segregated beyond alteration. 

-H. Holden Thorp 


H. Holden Thorp 
Editor-in-Chief, 
Science journals. 
hthorp@aaas.org: 
@hholdenthorp 

IN BRIEF


Edited by Jeffrey Brainard 


ASTRONOMY 


Dust clouds seen collapsing into planets 


Researchers say they’ve glimpsed distant planets
forming from collapsing clouds of dust around a 
young star, providing some of the first evidence 
for a theory of planet formation. Theorists be- 
lieve planets are born either from the bottom- 
up, when dust aggregates and gravitationally 
attracts more material, or top-down, as dust clouds col- 
lapse under gravity. Gaps in disks of dust around young 
stars, presumably created as protoplanets vacuum up 
material, provide evidence for the bottom-up scenario. 


The top-down phenomenon has been harder to spot. 
But a team that used Europe’s Very Large Telescope to 
study the star V960 Mon, 5000 light-years from Earth, 
found it was surrounded not by a disk, but spiral arms 
wider than the Solar System (above, in a composite im-
age recorded at visible and radio wavelengths). Looking
again using a radio telescope, they saw that sections 
of the arms were collapsing into clumps with planet- 
size masses, the research team reports this week in The 
Astrophysical Journal. 


U.S. agency bars Wuhan lab funding 


COVID-19 | The Chinese lab at the center of
the debate about the origin of the COVID-19
pandemic was suspended last week from
receiving funding from the U.S. Department
of Health and Human Services (HHS).
From 2014 to 2019, the Wuhan Institute 
of Virology (WIV) obtained money from 
the National Institutes of Health, an HHS 
branch, through a subcontract to a U.S. 
nonprofit, the EcoHealth Alliance. HHS 
faulted WIV for not sharing laboratory 
notebooks and electronic files related to
a mouse experiment that engineered a bat 
coronavirus and may have made it more 
dangerous to humans. HHS’s suspension 
order carries more symbolic than practical 
impact and includes a proposal to perma- 
nently bar the lab from future grants or 
contracts. HHS does not allege that the 
mouse experiments created SARS-CoV-2, 
but suspects they violated biosafety condi- 
tions stipulated with that grant. Critics 

of HHS say it bowed to political pressure 
and the action undermines relations with 


Chinese scientists who could help prevent 
future pandemics. 


Trinity test sent fallout far 


NUCLEAR WEAPONS | The July 1945 Trinity 
test, which detonated the first atomic 

bomb in a New Mexico desert, spread highly 
radioactive fallout farther than previously 
thought, a study has found. Researchers 
used newly discovered, historical weather 
data and modern atmospheric modeling to 
reconstruct the dispersal for the first time. 
In a preprint posted last week on the arXiv
server, they estimated that low levels of 
fallout reached 46 states within 10 days, 
and that, near the Trinity site, levels were 
as high as those in a wider area surround- 
ing the Nevada site where the government 
conducted 93 aboveground nuclear tests 
during the 1950s. Under a 1990 law, the 
government has compensated people 

who may have developed cancer or other 
diseases from exposure to the Nevada site’s 
fallout. But people living near the Trinity 
blast have not received those payments 
because, unlike at the Nevada site, exposure 
levels were never estimated. 


Head of U.S. pandemic unit named 


LEADERSHIP | President Joe Biden’s adminis- 
tration last week appointed Paul Friedrichs, 
a surgeon and retired Air Force major gen- 
eral, as the inaugural director of the White 
House Office of Pandemic Preparedness and 
Response Policy. The unit is responsible for 
coordinating and implementing plans to 
respond to biological threats and pathogens 
that could lead to a pandemic. It replaces a 
response team focused on COVID-19. Public 
health specialists have warned that despite 
COVID-19, the world remains under- 
prepared for future pandemics. 


Carbon fuels summer heat waves 


CLIMATE CHANGE | As the hottest sum- 
mer in modern history continues, with 
heat waves in parts of China, Europe, and 
North America all topping 45°C, a new 
analysis confirms what climate scientists 
already suspected: These extremes would 
have been “virtually impossible” to reach 
without society burning fossil fuels and 
emitting greenhouse gases. The heat waves 
have been driven primarily by weather 
fluctuations, such as kinks in the jet stream 
that allow heat domes to build up under 
lingering, cloud-free, high-pressure sys- 
tems. But the heat waves would have been 
as much as 2.5°C cooler in a preindustrial 
climate, World Weather Attribution, an aca- 
demic research consortium, said this week. 
This month is nearly certain to be the 
hottest on record, forecasters say, and as an 
El Niño warming trend in the Pacific Ocean
continues to raise temperatures, odds are 
growing that the whole of this year will set 
a new record for surface heat. 


CONSERVATION 


Iconic parrot returns to mainland New Zealand 


The world’s heftiest parrot, which nearly went extinct a few decades ago, has taken 
a significant step toward recovery. Last week, conservation biologists transferred 
four male kakapos (Strigops habroptilus), which typically weigh about 3 kilograms 
each, from an offshore breeding colony to a wildlife reserve on New Zealand’s North 
Island, marking the first time in 40 years that the endangered species has lived on 
the mainland. The flightless birds were no match for hunters or invasive predators like 
stoats and rats, and their numbers plummeted to a few dozen by 1995. Biologists moved 
the survivors to three uninhabited, predator-free islands and have been breeding them 
to minimize loss of genetic diversity. To speed timely artificial insemination, workers 
sent drones bearing kakapo sperm flying across the island; their efforts helped boost 
the population to 252 in 2022. The four males were relocated to Sanctuary Mountain, a 
fenced, 3400-hectare wildlife reserve, where researchers will study how well they adapt. 


Four kakapos released on New Zealand’s mainland are harbingers of a restoration effort. 


Fewer doubts in Science papers 


PUBLISHING | Describing uncertainty in 
findings and conclusions is considered a 
hallmark of carefully prepared scientific 
papers, but the frequency of such hedges 
has fallen over the past 2 decades—perhaps 
because of pressure to publish, a study 
asserts. In one of the largest studies of its 
kind, a research team looked for words 
conveying doubt and uncertainty—such 
as “might,” “could,” “appear to,” “probably,” 
“approximately,” and “seem”—in research 
articles published from 1997 to 2021 in 
Science, which the team chose because it 
publishes articles from multiple disciplines. 
The frequency of hedges dropped by close to 
half, from 115.8 instances per 10,000 words 
in 1997 to 67.42 per 10,000 words in 2021. 
The finding is consistent with other recent 
research that found an increase in recent 
decades in the use of positive language that 
may be associated with forceful, possibly 
exaggerated, claims. The analysis appears in 
the August issue of Scientometrics. 
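
A hedge rate of the kind reported here can be sketched in a few lines of Python. This is a minimal illustration only, not the study's method: the word list contains just the six terms quoted in this item, the function names are hypothetical, and matching is on whole words (so "seems" is not counted as "seem").

```python
import re

# Only the six hedge terms quoted in the news item; the study's
# full word list is not given here, so this list is an assumption.
HEDGES = ["might", "could", "appear to", "probably", "approximately", "seem"]


def hedges_per_10k(text, hedges=HEDGES):
    """Return hedge occurrences per 10,000 words of text (hypothetical helper)."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words:
        return 0.0
    # Re-join with single spaces so two-word hedges such as "appear to" match.
    normalized = " ".join(words)
    count = sum(
        len(re.findall(r"\b" + re.escape(h) + r"\b", normalized))
        for h in hedges
    )
    return 10_000 * count / len(words)


def relative_drop(before, after):
    """Fractional decrease from before to after."""
    return (before - after) / before


# Rates reported in the item: 115.8 (1997) and 67.42 (2021) per 10,000 words.
drop = relative_drop(115.8, 67.42)
sample_rate = hedges_per_10k("This might work and could probably fail")
```

With the two reported rates, the relative decrease works out to roughly 42%.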


Hydrogen hunter gets big funding 


RENEWABLE ENERGY | A new company pros- 
pecting for underground stores of hydrogen 
has received $91 million in investment 
funding, a record for the field. Denver-based 
company Koloma, which came out of stealth 
mode last week in a profile in Forbes, is 
getting an undisclosed portion of its fund- 
ing from Bill Gates-backed Breakthrough 
Energy Ventures. Currently, all hydrogen 
used in industry is manufactured, but 
some scientists have concluded, contrary to 
conventional wisdom, that Earth naturally 
produces and traps vast stores of the gas, 
which burns without greenhouse emis- 
sions. Co-founded by Ohio State University 
geochemist Tom Darrah, Koloma has turned 
to remote sensing and artificial intelligence 
to identify promising deposits. The company 
says it is actively searching in the United 
States but declines to say where. 


BY THE NUMBERS 


2% 


Share of U.S. survey respondents 
who said returning astronauts to 
the Moon should be one of 
NASA's top priorities. Sixty percent 
chose searching for asteroids 
that could hit Earth; 50%, monitoring 
Earth’s climate. (Pew Research Center) 
Origin of diamond-bearing eruptions revealed 


Deep mantle waves from continental rifting 
trigger mysterious kimberlite volcanoes 


By Paul Voosen 


Forged under extreme temperatures 
and pressures more than 150 kilo- 
meters down in the mantle, diamonds 
ride rockets to reach Earth’s surface: 
narrow pipes of magma called kim- 
berlite that can erupt at the speed 
of sound. Strangely, most kimberlite pipes 
are found in the quiet, ancient interiors of 
continents. They are far from where most 
other eruptions occur: at the edges of tec- 
tonic plates and near mantle plumes, broad 
upwellings that form volcanic hot spots 
such as Hawaii or Yellowstone. “How 
on Earth did these get here?” asks 
Thomas Gernon, a geologist at the 
University of Southampton. “It was 
an elephant in the room that no one 
had a good explanation for.” 

Now, Gernon and his colleagues 
believe they do. They say the tim- 
ing and location of these diamond- 
bearing eruptions suggest they are 
aftereffects of the breakup of super- 
continents, which causes whirling 
turbulence in the viscous mantle 
rock below. Like slow-motion tidal 
waves of rock, the researchers say, 
these swells ripple beneath the 
continents, traveling hundreds of 
kilometers over the course of mil- 
lions of years—and occasionally triggering 
kimberlite eruptions. “Kimberlites seem 
to be responding to rhythms of super- 
continents,” Gernon says. 

The finding, published this week in Na- 
ture, is about more than diamonds and 
kimberlites, says Folarin Kolawole, a struc- 
tural geologist at Columbia University who 
is unaffiliated with the study. It suggests 
that tectonic action near Earth’s surface 
can influence the behavior of the mantle on 
a broader scale than once thought. It also 
indicates that the underground waves keep 
the margins of newly divided continents 
volcanically active far longer than expected, 
possibly explaining other volcanic rocks 
that had previously been chalked up to man- 
tle plumes. “This gives us a way forward— 
a hypothesis to test,” Kolawole says. 


A chunk of kimberlite from Kimberley, South Africa, carries a diamond, 
a hitchhiker from more than 150 kilometers down in the mantle. 

Named after a mine in Kimberley, South 
Africa, that produced most of the world’s 
diamonds in the late 1800s, kimberlite 
forms when mantle rocks melt into a dense 
magma rich in magnesium, water, and 
carbon dioxide. The CO₂ and water create 
bubbles of gas that drive a narrow plug of 
magma toward the surface like the cork 
popping from a bottle of champagne. 

At depth, the pipes are just a few 
meters across, but they expand into 
carrot-shaped cones when they burst 
out, leaving craters hundreds of me- 
ters across. “It’s going to be quite cat- 
astrophic when it reaches the Earth’s 
surface,” says Emma Humphreys- 
Williams, a geochemist at the Natu- 
ral History Museum in London. 
Along the way, the magma picks up 
hitchhikers from the upper mantle 
and continental crust, including dia- 
monds, that are rich in clues to the 
behavior of the deep Earth. “There’s 
a whole element of Earth history that 
would be lost if we didn’t have kim- 
berlites,” says Kelly Russell at the Uni- 
versity of British Columbia. 
The Udachny kimberlite pipe in 
Siberia is one of the world’s 
largest open pit diamond mines. 


Although Earth is dotted with thousands 
of known kimberlite pipes, not one has been 
known to erupt in recorded human history. 
The vast majority are hundreds of millions of 
years old. “All of us volcanologists would be 
willing to chip in and have a new eruption of 
kimberlites,” Russell says. 

During COVID-19 lockdowns, Gernon 
found himself returning to the ques- 
tion of why these eruptions only occur in 
cratons—the old, cold interiors of continents. 
Although some researchers believed mantle 
plumes could be the source, many kimberlite 
deposits don’t match known hot spot tracks. 
In addition, geochemical analysis of isotopes 
in mantle plume rocks suggests they originate 
in the lower mantle, whereas recent analyses 
of kimberlites point to a shallower origin. 

Gernon and his co-authors noticed how 
the timing of kimberlite eruptions did seem 
to match up with landmark events in plate 
tectonics. They reconstructed the move- 
ment of the continental plates over the past 
500 million years, comparing the rate of con- 
tinental rifting with the bursts of kimberlite 
formation. The analysis showed kimber- 
lite eruptions seemed to peak, on average, 
26 million years after a continental breakup. 
“That was curious and jaw dropping,” 
Gernon says. 

They then zoomed in on the geologic his- 
tory of kimberlite deposits in southern Africa 


SCIENCE science.org 


and South America, which formed after the 
breakup of the Gondwana supercontinent 
120 million years ago, and those that 
formed in North America after the earlier 
crack up of Pangaea. The kimberlite volca- 
noes popped off progressively farther from 
the rift over time, the clusters shifting some 
20 kilometers every million years. 
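
To put that migration rate in everyday units, here is a small Python sketch of the unit conversion. It assumes a constant rate, which is a simplification, and the function names and the 10-million-year example interval are illustrative choices, not figures from the study.

```python
MM_PER_KM = 1_000_000      # millimeters in one kilometer
YEARS_PER_MYR = 1_000_000  # years in one million years


def km_per_myr_to_mm_per_year(rate_km_per_myr):
    """Convert kilometers per million years to millimeters per year."""
    return rate_km_per_myr * MM_PER_KM / YEARS_PER_MYR


def distance_km(rate_km_per_myr, elapsed_myr):
    """Distance covered at a constant rate (a simplifying assumption)."""
    return rate_km_per_myr * elapsed_myr


# The article's figure: some 20 kilometers every million years.
rate_mm_per_year = km_per_myr_to_mm_per_year(20.0)

# Hypothetical interval of 10 million years, chosen only for illustration.
example_distance_km = distance_km(20.0, 10.0)
```

The conversion factors cancel, so 20 kilometers per million years is the same as 20 millimeters per year.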

Gernon and his colleagues think they 
know what drives this migration. As conti- 
nents split apart, hot mantle rocks well up to 
fill the gap. But they cool off and sink as they 
rub up against the cold continental sides of 
the gap, creating whirling convective pat- 
terns. Computer modeling shows these vor- 
tices travel along the keels of the continents, 
stripping their mushy roots and creating a 
rock mix perfect for melting into kimberlite. 
The simulations show the waves crawling 
along at a pace that matches the propaga- 
tion of kimberlites: about one-millionth of 
a snail’s pace. “It’s a really clever idea,” says 
Jeroen van Hunen, a geodynamicist at Dur- 
ham University. “It makes perfect sense.” 

It’s also a big claim, as the powerful 
waves would strip some 40 kilometers of 
rock from the base of continents. But some 
other evidence seems to support it. For 
example, not only do kimberlites migrate 
out from rifts over time, but their mix of 
isotopes also shifts, from patterns that re- 
semble a mantle-crust mix where the wave 
first breaks to a more uniform upper man- 
tle composition as the wave dies out. And 
in the Kaapvaal craton of southern Africa, 
for example, the continent saw several kilo- 
meters of uplift around the same time as the 
kimberlite eruptions. The uplift suggests the 
wave was underfoot at the time, stripping off 
the continent’s undercarriage and allowing it 
to rise like a hot air balloon shedding its bal- 
last. “Combined, this evidence is really com- 
pelling,” Gernon says. 

It’s unlikely the team has found the sin- 
gle cause for kimberlites, given how noisy 
the data are, and how much Earth’s quirks 
vary from place to place, says Karen Smit, a 
geochemist at the University of the Witwa- 
tersrand. “It’s a model that makes sense. I 
just don’t know if that correlation exists glob- 
ally.” But Kolawole says the study is likely to 
prompt a surge of work in regions such as 
the Gulf of California or the Red Sea, where 
incipient rifting might be creating the deep 
waves, which seismic observations could 
reveal. The theory might also explain some 
volcanic deposits that were previously attrib- 
uted to plumes, he adds. 

The greatest interest in the study may come 
from commercial diamond miners, Gernon 
says. In theory, it could help predict the 
location of undiscovered kimberlites, 
he says. “You should be able to pinpoint, 
roughly, the sweet spot for diamonds.” 


ETHICS 


Famously creepy museum reckons with its past 


Mütter Museum launches an ethical review of its 
anatomical curiosities—and sets off a firestorm 


By Rodrigo Pérez Ortega 


Many regular visitors to the Mütter 
Museum in Philadelphia have 
their favorite specimen. There’s the 
megacolon—a 2.4-meter-long brown 
organ, the result of Hirschsprung 
disease in a 29-year-old man who 
performed at a freak show as Balloon Man 
and died in 1892 with 18 kilograms of poop 
in his bowels. There’s the Soap Lady, whose 
remains are covered with adipocere, or 
“corpse wax,” a fatty substance that forms 
in warm, alkaline, and airless environments. 
And there are the skeletons of Carol Orzel 
and Harry Eastlack, two Philadelphians who 
lived with fibrodysplasia ossificans progres- 
siva, in which connective tissue slowly turns 
into bone. 

For decades, the 35,000 objects and speci- 
mens in the storied museum, which just 
completed a $3.2 million renovation of its 
storage and lab facilities, have attracted le- 
gions of fans, including some members of 
the disability community who see themselves 
represented in its exhibits. But in February, 
members noticed that most of the museum’s 
images and videos had disappeared from its 
website and YouTube channel without ex- 
planation, although the objects themselves 
remained on display. After a public outcry, 
the leadership of the museum and the Col- 
lege of Physicians of Philadelphia, which 
owns the Mütter, revealed the reason: They 
have launched a thorough ethical review of 
the museum’s handling and display of its 
6500 specimens of human remains. 

Although turnoffs for some, the Mütter’s 
skeletons, skulls, and body parts, displayed 
in a 19th century cabinet of curiosities style, 
are a source of knowledge as well as fascina- 
tion. Bioarchaeologist Molly Zuckerman still 
remembers her first visit to the Mütter, when 
she was 21. “I [found out] how much I really 
enjoyed learning about health and disease,” 
recalls Zuckerman, now at Mississippi State 
University. By displaying extreme examples 
of human anatomy, she says, the Mütter “re- 
ally shows us, in a way that other resources 
don’t, the capabilities of the body.” 

But ethical standards for collecting and 
displaying human specimens have changed 
over the years. When Kate Quinn, executive 
director of the Mütter, took over leadership 
in September 2022, she was surprised to 
find the museum didn’t have specific ethics 
policies. “That makes it all the more impor- 
tant for us to move forward together to cre- 
ate an ethics policy for all aspects of [our] 
work,” she says. 

“I see it as taking the right path for those 
individuals who cannot speak for themselves 
[and] did not know that they were going to 
end up in this museum,” says Carlina de la 
Cova, a biological anthropologist at the Uni- 
versity of South Carolina. 

But some are unhappy with how the re- 
view is being done. An online petition called 
Protect the Integrity of the Mütter Mu- 
seum, led by museum members and mem- 
bers of the public, has amassed more than 
31,000 signatures demanding the reinstate- 
ment of all online content, as well as more 
transparency from leaders about their deci- 
sions. The controversy illuminates the thicket 
of ethical issues faced by museums, especially 
those focused on human anatomy and dis- 
ease, as they reckon with the history behind 
their collections. “You end up reexamining 
yourself,” says Mira Irons, president and CEO 
of the College of Physicians. “It’s painful.” 

The Mütter started with a donation of 
1700 objects from the private collection of 
Thomas Dent Mütter, a 19th century Phila- 
delphia surgeon who operated on hundreds 
of patients with unusual anatomy. Physicians 
of the time often kept body parts excised 
during surgery and collected unclaimed bod- 
ies, practices legal at the time but unethical 
today. Around 1875, for example, anatomist 
Joseph Leidy became aware that bodies from 
the cemetery where the Soap Lady was bur- 
ied were being relocated. He lied to the grave 
digger, stating she was his grandmother, and 
obtained the saponified body, which he later 
donated to the Mütter. The museum also dis- 
plays slides of Albert Einstein’s brain, which 
was dissected without his or his family’s con- 
sent. “[Physicians back then] collected people 
like kids collect Pokémon,” de la Cova says. 
These days, she says, “Scholars who work 
with the dead think about better ways of en- 
gaging with [them].” In Philadelphia, an im- 
petus came in 2021 when antiracism activists 
protested the news that the Penn Museum— 
blocks away from the Mütter—had retained, 
for teaching and research, the remains of vic- 
tims of the 1985 MOVE bombing, in which 
the police killed members of a Black separat- 
ist group. 


The Mütter Museum in Philadelphia, shown in 2013, displays specimens in a cabinet of curiosities style. 


The Mütter is now returning the remains 
of seven Native Americans to communities 
in New Jersey and California, as required 
by federal law. And it launched the ethical 
review of all of its 450 YouTube videos and 
website images in January. Quinn gathered a 
multidisciplinary group of about 20 experts 
that includes members of the museum, col- 
lege fellows, and the disability community 
to be sure the human remains shown are 
treated with respect. In one video that is 
likely to be dropped or edited, former staff 
jokingly pretend to brush the teeth of skulls. 


The plan is to finish the review by early Sep- 
tember, Quinn says. 

Staff began to audit every human speci- 
men in May, reviewing how it was acquired 
and how it is displayed—a process that will 
take up to 48 months. For each human speci- 
men, “we need to become much more aware 
of the circumstances of their lives and the 
times in which they lived,” Quinn says. 

Take the Soap Lady. “We know she was 
taken from her grave,” Quinn says. But the 
museum doesn’t have information on her 
identity or records of her or her family’s con- 
sent for donation—let alone display. “Person- 
ally, do I believe she should go back into the 
ground? Yeah, probably.” But that decision 
will be up to the college’s Board of Trustees, 
after hearing the community’s perspectives 
and discussing whether there’s a strong med- 
ical reason to keep the remains on display. 
“That’s one case,” Quinn says. “There’s 6500 
of them.” 

For some specimens, such as the mega- 
colon, what is missing is contextual informa- 
tion. “It’s a little bit like a circus,” says Sabine 
Hildebrandt, an anatomy educator and re- 
searcher on the history and ethics of anatomy 
at Boston Children’s Hospital and Harvard 
Medical School. The specimen is currently 
displayed as an “object of curiosity,” she says, 
rather than what it really is: “witness to the 
suffering of a person.” 

The museum’s new direction irks Robert 
Pendarvis. In 2021, he donated his enlarged 
heart, the result of a condition called acro- 
megaly, to the Mütter after getting a trans- 
plant at Duke University Medical Center. He 
wanted his information public and was upset 
when a video of his heart was taken down, 
according to a report by a Philadelphia radio 
station. (His heart is still on display.) 

Others have broader complaints. Although 
Ezra Eisenstein, an organizer of the Protect 
the Mütter campaign, agrees with the need 
for the ethical review, “What are the review 
criteria? Who’s doing the reviews?” he asks. 


“There’s just a complete lack of transparency 
or accountability.” 

“We are being as transparent as we can,” 
Quinn says. “We’re open to all conversations.” 
The museum is now providing updates to the 
public in a dedicated page on its website. 

She and other experts stress the value of 
the displays. Zuckerman notes that the Müt- 
ter houses bones bearing scars from the late 
stages of syphilis, rare in patients today. See- 
ing the early examples helps clinicians rec- 
ognize the full expression of the untreated 
disease, she says. “They’ve never seen con- 
genital syphilis before.” 

But, Quinn promises, “Everything that we 
will be doing moving forward is out of re- 
spect for the humans involved in this conver- 
sation, both living and dead.” 
Seth Berkley (left), here in Africa in 2019 during an outbreak of Ebola, has long championed vaccines. 


INFECTIOUS DISEASES 


Childhood vaccine crusader 
shares concerns for future 


As Seth Berkley leaves Gavi, the Vaccine Alliance, the 
epidemiologist reflects on its growth and challenges ahead 


By Jon Cohen 


By the start of 2011, the year epidemiologist Seth Berkley became CEO of Gavi, the 
Vaccine Alliance, the nonprofit had over its 11-year history supported the immu- 
nization of 288 million children in poor countries. But it also had a $3.7 billion 
funding gap between its plans and donor financial pledges. Berkley intends to 
step down from Gavi’s helm on 2 August, arguably leaving it in a better place. Be- 
tween 2011 and 2022, it raised $33.3 billion in financing from governments, philan- 
thropies, and industry—a jump of 683% from its first decade. And by now, Gavi has helped 
low- and middle-income countries vaccinate 1 billion children against a widening range 
of diseases. Gavi also played a central role in creating and running the COVID-19 Vaccines 
Global Access (COVAX) Facility. Although COVAX fell short of its goal of getting vaccines 
to lower- and middle-income countries at the same time as they were rolled out to wealthy 
countries, COVAX still provided nearly 2 billion doses. Science recently spoke with Berkley 
about Gavi’s past and future. This interview has been edited for clarity and brevity. (A full 
version is at https://scim.ag/SBerkleyQA.) 


Q: In Gavi-supported countries, 77% of 
children in 2021 had a complete course of 
vaccinations with shots that protect against 
diphtheria, tetanus, and pertussis—a jump 
of 19% since 2000 and only 4% shy of the 
global rate. Is that Gavi’s greatest success? 
A: No, no, I wouldn’t think that. With other 
vaccines, against hepatitis B, haemophilus 
influenzae B, and pneumococcal disease, 
we went from, in essence, zero to 40%, 
50%, 60%. So it’s not just the number of 
children being reached, but it’s the number 
of vaccines that they have been provided. 
Now, when I think of successes, the most 
important one is the 70% reduction in 
vaccine-preventable disease deaths in Gavi 
countries, which then contributed to the 
more than 50% reduction in the under 
[age] 5 mortality rate. Those are the kinds 
of things I’m proudest of. 


Q: Gavi has had 19 countries transition to 
self-financing, and other countries are now 
cofinancing another $1.5 billion of the vac- 
cination costs. Do you see Gavi fading out? 
A: Many countries started out at such low 
income that it is going to be a long time. 
Rwanda, even though it’s made great 
strides, is a very poor country. The other 
part, of course, is fragility. Mali, South Su- 
dan, Yemen, Somalia, Haiti—they will not 
be on their own for a long time. 


Q: A vaccine against respiratory syncytial 
virus may soon be approved for pregnant 
people because their antibodies will protect 
their newborns. Would Gavi fund that? 

A: Yes, absolutely. It depends on what the 
vaccine ultimately costs. 


Q: What do you think of COVAX’s 
performance? 

A: There’s always going to be a day one 
problem: The richest, most powerful 
countries are going to want the avail- 
able doses of vaccine, but what we can 

do is improve speed, efficiency, and the 
ability to scale vaccine production and 
delivery. If you look at the 92 countries 
[who signed onto COVAX], they have 55% 
primary coverage with COVID-19 vac- 
cines, as compared to 65% globally. It’s 
not equitable, it was delayed, but it’s bet- 
ter than it’s ever been. My worry is if you 
say, “COVAX was a complete failure,” what 
that means is then, OK, we’re not going to 
learn the lessons, we’re not going to build 
on the positives. That’s what I think has to 
happen now. 


Q: Gavi recently introduced a vaccine that 
works against six diseases. That obviously 
simplifies delivery of shots. In the future, do 
you foresee, say, a 20-in-one shot? 

A: I suspect what will happen is instead of 
going to 20, you’re going to have a respira- 
tory disease vaccine for COVID, flu, and 
RSV, and a vaccine for infants that covers a 
range of diseases, and maybe a combination 
vaccine for the pregnant women that covers 
group B strep, RSV, and a diphtheria/tetanus 
booster. That’s probably the way we’d go first 
before we went to one master vaccine. 


Q: What are the biggest challenges for 
your successor? 

A: It’s a tough fiscal time, so making sure 
that there is continued support for the prior- 
ity of vaccines and public health is really im- 
portant. Climate change is the biggest, most 
important problem going forward. It’s going 
to lead to a polyepidemic [as new pathogens 
emerge from disrupted ecosystems]. We’re 
going to have massive movement of people, 
water challenges, desertification, displaced 
people, and that’s going to all lead to disease, 
because it’s going to affect ecosystems. And 
then we hit the wall of populations in all 
countries getting older. We should start 
thinking about providing vaccines for older 
people in developing countries. That is 
scary to donors, because it’s more money 
and more work, but at the end, what other 
choice do we have? 
SCIENTIFIC INTEGRITY 


‘I should have done better’: 
Stanford head steps down 


Probe clears Marc Tessier-Lavigne of misconduct but 
criticizes lab culture and lack of “appetite” for corrections 


By Jocelyn Kaiser 


When a student journalist’s al- 
legations of research fraud in 
past work by Stanford Univer- 
sity President Marc Tessier- 
Lavigne triggered an investiga- 
tion late last year, many onlookers 
thought the award-winning neuroscientist 
would emerge unscathed. But although 
the probe last week cleared him of fraud 
and any cover-up, its exoneration wasn’t as 
complete as he must have hoped. Tessier- 
Lavigne announced he will step down next 
month, after the probe identified data ma- 
nipulation by members of his lab in papers 
dating back to 1999, faulted him for a lab 
culture that favored “winners,” and said he 
had “failed to decisively and forthrightly 
correct” errors in his papers. 

Tessier-Lavigne, 63, whose research focus 
has ranged from spinal cord development 
to Alzheimer’s disease, said he would seek 
retractions or corrections on five papers. 
He also acknowledged that the investiga- 
tory scientific panel engaged by Stanford’s 
Board of Trustees “identified some areas 
where I should have done better.” Tessier- 
Lavigne will remain a faculty member and 
continue to run a lab at the school. 

Some Stanford faculty say in the end, they 
weren't surprised by the resignation. “This 
has been a cloud hanging over Stanford for 
quite a while now ... a serious cloud,” says 
Hank Greely, a legal expert and bioethics 
scholar. “This is enough exoneration that he 
can go out with dignity.” 

Still, the news saddened some of Tessier- 
Lavigne’s colleagues. “In my experience, Marc 
is a person of the highest integrity, intelli- 
gence, and generosity,” says Cori Bargmann, 
a neurobiologist at Rockefeller University 
who describes Tessier-Lavigne as a longtime
friend. He was “excellent” in the role
of president as well, she adds.

Tessier-Lavigne is best known for his 
work in the 1990s discovering netrins, 
proteins that guide the growth of nerve 
cell projections called axons. His troubles 
began in November 2022, when The Stan- 
ford Daily reported that The EMBO Journal 
was investigating alleged image manipula- 
tion in a 2008 article, and that a research 
fraud expert had confirmed similar possible 
problems in other work from his lab, which 
had been initially discussed on PubPeer, a 
forum where scientists, often anonymously, 
discuss irregularities in papers. 

Concerns about three of those papers— 
one in Cell in 1999 and two in Science in 
2001—had first come up on PubPeer in 2015, 
when Tessier-Lavigne was under consider- 
ation for the Stanford presidency. Tessier- 
Lavigne, then president of Rockefeller, sub- 
mitted corrections to both journals, but 
Science failed to publish them because of an 
editorial error and Cell found that a correction
wasn’t needed. After the new publicity,
both journals in December 2022 added expressions
of concern to the papers.

Marc Tessier-Lavigne will keep his lab on Stanford
University’s campus.

In January, a special committee convened 
by Stanford’s board contracted Mark Filip, 
a former federal judge, to lead a probe with 
a panel of five prominent scientists: neuro- 
scientists Hollis Cline of Scripps Research 
and Kafui Dzirasa of Duke University; Steven 
Hyman, provost emeritus at Harvard Uni- 
versity; Randy Schekman, a cell biologist at 
the University of California, Berkeley, and 
former editor-in-chief of the Proceedings of 
the National Academy of Sciences; and mo- 
lecular biologist Shirley Tilghman, former 
president of Princeton University. 

Tessier-Lavigne’s position became more 
precarious a month later, when The Stan- 
ford Daily charged he had covered up other 
misconduct. It reported that in 2011, while 
he was head of Genentech’s research arm, 
the company began investigating and found 
fraudulent data in a high-profile 2009 Na- 
ture paper from his lab that described an 
unexpected role for a protein in causing the 
neurodegeneration of Alzheimer’s. Based on 
interviews with several former Genentech 
employees, the school paper reported that 
Tessier-Lavigne suppressed the company’s 
findings and refused to retract the paper. 
The Stanford president called the reporting 
“replete with falsehoods” and the allegations 
“breathtakingly outrageous.” 

In April, Genentech released its own inves- 
tigation, noting that although its scientists 
at the time could not replicate the Nature 
paper’s key findings, the company had no 
records of a misconduct investigation. But 
Genentech also disclosed that in 2010 it had 
dismissed a postdoc in Tessier-Lavigne’s lab 
for misconduct on a different manuscript 
that was withdrawn before publication. 

In all, the Stanford-commissioned scientific
panel reviewed a dozen of Tessier-Lavigne’s
papers. Its report, released on 19 July, found
that for seven papers on which he was a
middle, or secondary, author, the primary 
authors have taken responsibility for any 
data manipulation, and he bears no blame. 

But the panel unearthed “serious flaws” in 
the five papers on which he is a correspond- 
ing author: the 1999 Cell paper, the two 2001 
Science papers, a 2004 Nature paper, and the 
2009 Nature paper from Genentech. The first 
four had signs of “apparent manipulation of
research data by others,” such as an image
of a Western blot, a type of protein analysis, 
that is presented as data in three different 
experiments. Tessier-Lavigne had no “actual 
knowledge” of manipulated data before the 
papers were published, the report says. 

The 2009 Nature paper, meanwhile, had 
“multiple problems” and showed a “lack of
rigor,” but the report finds that The Stanford
Daily’s allegations of fraud and a cover-up 
at Genentech “appear to be mistaken.” (The 
school paper, however, noted last week that 
some relevant sources declined to talk to the 
Stanford panel because it would not guaran- 
tee anonymity.) 

The report also concludes that Tessier- 
Lavigne did not respond adequately when 
concerns about his work were raised at 
four different points over 2 decades. For 
example, it chides him for failing to fol- 
low up when Science did not publish the 
corrections he submitted, and for his “sub- 
optimal” decision not to correct or retract 
the 2009 Nature paper; instead, he and col- 
leagues published follow-up papers revising 
the findings. Without “an appropriate appe- 
tite” for corrections, “the often-claimed self- 
correcting nature of the scientific process 
will not occur,” the report says. 

The report laid some of the blame for the 
“unusual frequency” of manipulated data 
or lax practices on the culture of Tessier- 
Lavigne’s labs. Former postdocs told the 
panel that “winners” who produced favor- 
able results were rewarded, whereas “los- 
ers” were marginalized. Other postdocs, 
however, spoke positively to Science about 
Tessier-Lavigne’s mentoring and challenged 
the claim that he mismanaged his lab. 

Nicholas Hertz, co-founder and chief 
scientific officer of San Francisco-based 
startup Mitokinin, who worked with him 
at Rockefeller, says it was one of the most 
scientifically rigorous research groups he’s 
ever experienced. Although the atmosphere 
could be competitive, Hertz says that is “a 
very normal part of lab culture” anywhere, 
particularly in high-profile groups. 

Elisabeth Bik, the consultant who helped 
The Stanford Daily identify problems in 
Tessier-Lavigne’s papers, says it was appro- 
priate for him to step down as Stanford’s 
leader. “There were multiple cases of misconduct
done under his watch,” says Bik,
who believes he bears responsibility for not 
catching the manipulated data. 

Tessier-Lavigne expects to retract the Cell 
and two Science papers and correct the two 
Nature papers. Science and Cell say the re- 
tractions are already in motion. Nature says 
the journal “will follow up with the most ap- 
propriate course of action to ensure the in- 
tegrity of the scientific record.” 

For his part, Tessier-Lavigne said he “will 
be further tightening controls” in his current 
lab at Stanford, for example by matching 
processed images with raw data. His goal is 
“to ensure that these kinds of problems do 
not recur.” 


With reporting by Catherine Offord. 
Studies find little impact of
social media on polarization 


But some are uneasy about Meta’s role 
in unprecedented research collaboration 


By Kai Kupferschmidt 


The bitter 2020 U.S. presidential election
still reverberates in the courts
and the media—and, in results out 
today, in scientific journals. Three papers
in Science and one in Nature present
some of the first conclusions of
the 2020 Facebook and Instagram Election 
Study (FIES), in which researchers joined 
with Meta, the company behind Instagram 
and Facebook, to study the effects of social 
media on the attitudes and behavior of tens 
of thousands of users during the election. 

“These are huge experiments,” says 
Stephan Lewandowsky, a University of Bris- 
tol psychologist who was not part of the work. 
“And these results are quite interesting.” The 
study found U.S. conservatives 
were exposed to vastly more 
false news stories on Facebook 
than liberals. But surprisingly, 
removing all reshared content 
from individuals’ feeds for 
3 months did not affect their 
political attitudes or make their 
views any less polarized. Nei- 
ther did switching them from a 
curated feed to simply showing 
the most recent posts first. 

But the way the research 
was done, in partnership with Meta, is get- 
ting as much scrutiny as the results them- 
selves. Meta collaborated with 17 outside 
scientists who were not paid by the com- 
pany, were free to decide what analyses to 
run, and were given final say over the con- 
tent of the research papers. But to protect 
the privacy of Facebook and Instagram 
users, the outside researchers were not al- 
lowed to handle the raw data. 

This is not how research on the potential 
dangers of social media should be conducted, 
says Joe Bak-Coleman, a social scientist at 
the Columbia School of Journalism. “We 
have clear, long-standing concerns about the 
impact of [this] technology,” he says. But he
calls the partnership “restrictive” and ques- 
tions “relying entirely on the company in 
question to process the raw data into what is 
eventually analyzed.” 

That social media influences users’ political
views and stokes divisions has been a
long-standing concern. Critics are par- 
ticularly worried about “filter bubbles” 
of like-minded users engaging with one 
another and the influence of inscrutable 
algorithms that determine what posts in- 
dividual users see. Previous research sug- 
gests these features foster polarization and 
help false information and divisive or hate- 
ful content spread faster and wider. But 
studying these effects has been difficult 
for researchers because they do not have 
access to Meta’s data and could not experi- 
ment on users’ feeds, a kind of “holy grail 
of research,” says Andy Guess, a political 
scientist at Princeton University and lead 
author on two of the new FIES studies. 

The collaboration, which began after 
Meta reached out to the sci- 
entists in early 2020, opened 
the way to that kind of experi- 
ment. In three of the papers, 
the team studied whether 
changing aspects of users’ 
feeds over several months 
would affect various attitudes 
and behaviors. Each study 
looked at a separate group of 
about 23,000 users, recruited 
via invitations placed on the 
top of their feeds, and took 
place between late September and late De- 
cember 2020. 

In one experiment, the researchers pre- 
vented Facebook users from seeing any “re- 
shared” posts; in another, they displayed 
Instagram and Facebook feeds to users in 
reverse chronological order, instead of in 
an order curated by Meta’s algorithm. Both 
studies were published in Science. In a third 
study, published in Nature, the team reduced 
by one-third the number of posts Facebook 
users saw from “like-minded” sources—that 
is, people who share their political leanings. 
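
The three feed interventions described above can be sketched in a few lines of code. This is a toy illustration, not Meta's ranking system: the `Post` fields, the `engagement_score` used as a stand-in for a curated ranking, and the rule of dropping every third like-minded post are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: int
    timestamp: int            # larger means newer (hypothetical units)
    engagement_score: float   # hypothetical stand-in for a curated ranking
    is_reshare: bool
    like_minded: bool         # source shares the user's political leaning


def ranked_feed(posts):
    """Stand-in for a curated feed: highest hypothetical score first."""
    return sorted(posts, key=lambda p: p.engagement_score, reverse=True)


def without_reshares(posts):
    """Intervention 1: hide every reshared post."""
    return [p for p in ranked_feed(posts) if not p.is_reshare]


def reverse_chronological(posts):
    """Intervention 2: ignore the ranking and show newest posts first."""
    return sorted(posts, key=lambda p: p.timestamp, reverse=True)


def fewer_like_minded(posts):
    """Intervention 3: drop every third like-minded post (about one-third)."""
    kept, seen_like_minded = [], 0
    for post in ranked_feed(posts):
        if post.like_minded:
            seen_like_minded += 1
            if seen_like_minded % 3 == 0:
                continue  # skip this like-minded post
        kept.append(post)
    return kept


posts = [
    Post(1, 100, 0.9, False, True),
    Post(2, 300, 0.2, True, True),
    Post(3, 200, 0.5, False, False),
    Post(4, 400, 0.7, False, True),
]
ids = lambda feed: [p.post_id for p in feed]
```

Each function changes what a user sees without changing the underlying pool of posts, which is the sense in which the experiments altered people's "information diet."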

In each of the experiments, the tweaks did 
change the kind of content users saw: Re- 
moving reshared posts made people see far 
less political news and less news from un- 
trustworthy sources, for instance, but more 
uncivil content. Replacing the algorithm 
with a chronological feed led to people see- 
ing more untrustworthy content (because 
Meta’s algorithm downranks sources who
repeatedly share misinformation), though 
it cut hateful and intolerant content almost 
in half. Users in the experiments also ended
up spending much less time on the platforms
than other users, suggesting the altered feeds
were less compelling.

But surveys during and at the end of the 
experiments showed these differences did 
not translate into measurable effects on us- 
ers’ attitudes. Participants didn’t differ from 
other users in how polarized their views were, 
for example, or in their knowledge about the 
elections, their trust in media and political 
institutions, or their belief in the legitimacy
of the election. They also were no more or
less likely to vote in the 2020 election. 

“What this tells us is that you can change 
people’s information diet, but you're not go- 
ing to immediately move the needle on these 
other things,” Lewandowsky says. “I was a
little surprised by that.” Still, changes in peo- 
ple’s attitudes could happen over longer time 
periods, he says, and the studies don’t take 
into account the fact that social media also 
has societal effects. For example, decisions 
made by social media platforms shape the 
way news outlets present stories or even how 
politicians communicate. But Guess notes 
that experimenters can only do so much. “We 
can’t go back in time, and turn off social me- 
dia for some people, or assign people to dif- 
ferent versions of social platforms, and then 
see what happens in the intervening years.” 

There are other limitations. Instead of just 
comparing Meta’s algorithm with a chrono- 
logical feed, Lewandowsky says he would 
have liked to see a comparison with an algorithm
that downgrades material that is polarizing
or produces outrage. Guess says he
considered other algorithms but ultimately 
decided on a chronological feed as a com- 
parison in part because that is what Meta has 
used in the past and because some politicians 
and activists have proposed this as a way to 
alleviate some problems of social media. 

“No one is saying this means that social 
media has no negative effects,” says Brendan 
Nyhan, a political scientist at Dartmouth 
College and one of the researchers involved. 
“But these are three interventions that have 
been widely discussed and none of them 
measurably changed attitudes. So I think 
that that tells us something.” 


In another paper, also published in 
Science, researchers analyzed which news 
stories made it into the feeds of U.S. Face- 
book users, and correlated this with how 
liberal or conservative the users were 
according to Facebook’s own internal
classifier. They found that liberals and con- 
servatives read and engaged with different 
sets of political news stories—and that the 
large majority of content labeled false by 
third-party fact-checking programs was 
seen by conservative audiences. Only a 
small fraction of all content was rated 
false, however, and much misinformation 
may be going under the radar, some of 
which could be concentrated on the left, 
says Sandra González-Bailón, a social scientist
at the University of Pennsylvania
(UPenn) and the first author of the paper. 

The four new papers are just the first in 
a planned series of 16 to come out of the 
project. Meta is likely to be pleased with the 
results so far, particularly those on the algorithmic
feed, Lewandowsky says. “They can
argue that polarization isn’t their problem,
because it’s not their algorithm [at fault.]”
But that’s simply how it turned out, Guess 
says. “I’m not that focused on whether people 
think this looks good or bad for Meta. I’m
more interested in the scientific results and 
what we can learn about the impact of social 
media on politics.” 

Other researchers are more skeptical about 
the partnership. Jennifer Jacquet, a social sci- 
entist at New York University who has stud- 
ied how companies have undermined the 
scientific consensus on climate change, says 
collaborating with prestigious scientists is 
part of the playbook companies employ when 
threatened with regulation. “I think this is all 
part of the strategy of buying time, against 
any regulation, which they’ve been very good 
at doing,” she says. “There’s no doubt that 
these scientists [are] lending their credibil- 
ity to these companies themselves and that 
would make me a little bit uncomfortable.” 

But unless companies are forced to share 
their data, there are few ways to study them 
without collaborating, says Deen Freelon, a 
researcher at UPenn who took part in the 
study. “Absent some outside enforcer, we 
are at their mercy and that’s just how it is.” 
Freelon says the collaboration worked well. 
“I haven’t seen them ... prevent us from doing
anything that we wanted to do.”

Michael Wagner, a social scientist at the 
University of Wisconsin-Madison who was 
asked to observe the work and wrote a com- 
mentary accompanying the Science papers 
(see p. 388), says Meta’s business interests 
may have influenced the project at some 
points. For instance, he says Meta research- 
ers believed that the experimental studies 
changing users’ feeds were unlikely to show 
any big effects—and they pushed to get these 
papers done first. “You could read it as ‘the 
big splash is going to be that there aren’t 
huge effects that are so deleterious to democ- 
racy that we need to have a bunch of new 
regulations on our platform,’” he says.

Chad Kiewiet de Jonge, a researcher at 
Meta, says although some researchers did 
have those expectations, Meta’s interests did 
not affect the timing of the papers. “Meta did 
not make the final call on prioritization—it 
was decided jointly by the Meta research 
leads and the academic co-chairs.” 

On the whole, Wagner says, the project 
“was a net good” for research on the effects 
that social media platforms have on elec- 
tions. Still, he says, it should not be seen as 
a model for the future, because Meta has 
too much power in the arrangement. It’s 
important for researchers to keep pushing 
for unrestricted access to these platforms, 
Lewandowsky adds. “What they did with 
these papers is not complete independence. 
I think we can all agree on that.”
A company called Indigo is paying farmers
to trap carbon in their soils. Some researchers say
the climate benefits are dubious

By Gabriel Popkin




SCIENCE science.org 28 JULY 2023 + VOL 381 ISSUE 6656 369 


NEWS | FEATURES 


Lance Unger has been doing things a
little differently lately on his farm 
near the Wabash River in south- 
western Indiana. After last fall’s 
harvest, rather than leaving his 
fields fallow, he sowed some of 
them with cover crops of oats and 
sorghum that grew until the win- 
ter cold killed them off. And before 
planting corn and soybeans this spring, 
Unger drove a machine to shove aside yel- 
lowing stalks—last season’s “trash,” as he 
calls it—rather than tilling the soil and 
plowing the stalks under. 

For these efforts, a Boston-based com- 
pany called Indigo paid Unger $26,232 in 
late 2021 and an even larger chunk late last 
year. That’s how much an emerging market 
values the hundreds of tons of carbon that, 
in theory at least, Unger yanked out of the 
atmosphere with his cover crops or left 
in the soil by not tilling. Slowing climate 
change isn’t a priority for him, he says, 
and it hasn’t been easy to change his long- 
standing farming practices. But he says the 
money made it worthwhile. “I need to see 
economic benefits.” 

Indigo also made money in the deal. It 
took a 25% cut of the bundle of credits it 
then sold at about $40 per ton of captured 
carbon. Buyers were companies such as 
IBM, JPMorgan Chase, and Shopify, which 
were looking to offset greenhouse gas
emissions from their operations and bolster
their green bona fides.
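
As a rough check on the figures above, the payment and price can be combined into an implied tonnage. The sketch below rests on two assumptions the article does not state: that the farmer receives the 75% left after Indigo's cut, and that the roughly $40-per-ton price applied to Unger's late-2021 payment.

```python
SALE_PRICE_PER_TON = 40.0   # dollars; "about $40 per ton" in the text
INDIGO_CUT = 0.25           # Indigo's 25% share of the sale price
FARMER_PAYMENT = 26_232.0   # dollars paid to Unger in late 2021


def implied_tons(payment, price_per_ton, intermediary_cut):
    """Tons of carbon implied by a farmer payment, ASSUMING the farmer
    receives everything left after the intermediary's cut."""
    farmer_share_per_ton = price_per_ton * (1 - intermediary_cut)
    return payment / farmer_share_per_ton


tons = implied_tons(FARMER_PAYMENT, SALE_PRICE_PER_TON, INDIGO_CUT)
gross_sale = tons * SALE_PRICE_PER_TON
indigo_revenue = gross_sale * INDIGO_CUT

print(f"Implied carbon credited: {tons:.1f} tons")
print(f"Implied gross sale: ${gross_sale:,.0f}; Indigo's cut: ${indigo_revenue:,.0f}")
```

Under those assumptions the payment corresponds to about 874 tons, in line with the "hundreds of tons" the article describes.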

For advocates, the exchange represents a 
beautiful marriage of idealism and capital- 
ism in the service of an urgently needed 
climate solution. If applied across the 
globe’s farmland, soil-based carbon cap- 
ture could offset between 5% and 15% of 
greenhouse gas emissions every year, ac- 
cording to an influential 2004 study by 
Ohio State University soil scientist Rattan 
Lal. “I and many other scientists have a lot 
of confidence that we can build carbon in 
soil,” says Deborah Bossio, lead soil scien- 
tist for the Nature Conservancy. 

Millions of dollars of soil credits have 
already been sold, and companies like In- 
digo are ramping up aggressively to claim 
a piece of an industry that could overall 
be worth $50 billion by 2030, according
to the consulting firm McKinsey & Company.
With other carbon markets based on plant- 
ing or preserving trees facing accusations 
of peddling questionable or outright fraud- 
ulent credits, some buyers may see soil as 
a safer option. 

But as the industry heats up, so does the 
skepticism. Some researchers say the sci- 
ence of how soils store and release carbon 
is too uncertain to support an industry 
claiming to be cooling the planet. They ac- 
cuse companies like Indigo of exaggerating 
the benefits of their programs. 


“I think the eagerness has sort of dis- 
torted the vision of what is really possi- 
ble,” says Ernie Marx, a soil scientist who 
retired from Colorado State University 
(CSU) in 2021 and worked for more than 
a decade on the computer model Indigo 
and other companies use to calculate the 
credits. Emily Oldfield, a soil scientist 
with the Environmental Defense Fund 
who has examined soil-based carbon mar- 
kets, also has her doubts. “It’s really hard to 
evaluate the actual greenhouse gas benefit of 
these programs.” 


ONE THING is not in dispute: Modern agri- 
culture has not been kind to soils—or the 
climate. Over millennia, microbes converted 
some of the carbon in dead trees and plants 
into long-lasting forms, building rich soils 
around the world. But since humans started 
plowing and disturbing soils some 12,000
years ago, about 116 billion tons of carbon 
have been lost, either eroded away by wind 
and water or digested by microbes and re- 
spired to the atmosphere as carbon dioxide 
(CO₂), scientists estimated in a 2017 study.
So-called regenerative practices are supposed
to build and protect soil carbon rather
than release it. Some of the world’s biggest 
food giants, including General Mills, Land 
O’Lakes, and Cargill, have embraced the 
movement and claim to be reducing the climate
impact of their supply chains by paying
farmers to adopt regenerative tactics.
The U.S. government is also pumping billions
of dollars into what it calls “climate-smart
agriculture.”

Healthy soils, healthy planet

Modern agriculture has not been kind to soils: Billions of tons of carbon have been lost to the atmosphere or eroded away. Regenerative practices can increase
soil health and store carbon, slowing climate change and generating carbon credits that can be sold. But calculating the benefits is tricky.

Reduced tillage
What it means: Planting crops without turning over the soil first.
How it works: Depleted soils can accumulate carbon over time when farmers reduce or eliminate tillage.
Why it might not work: Decaying crop residues release carbon in the first few years. Reduced tillage is already widely practiced, limiting the potential for additional carbon gains.

Cover crops
What it means: Planting of rye or other crops outside of the growing season.
How it works: As they grow, cover crops draw carbon into roots that stay in the soil after plants die.
Why it might not work: To avoid interfering with cash crop plantings, cover crops often aren't grown long enough to sequester a meaningful amount of carbon.

Precision fertilizing
What it means: Applying only as much fertilizer as crops need, when they need it.
How it works: Less fertilizer means soil microbes emit less nitrous oxide (N₂O), a powerful greenhouse gas.
Why it might not work: Farmers may fear that reducing fertilizer will lower yields. Modeling N₂O emissions is difficult, making it hard to credit farmers for reductions.

Grazing management
What it means: Moving cattle and other grazing animals between pastures.
How it works: Grasses and other forage plants regrow, adding carbon to the soil through their roots.
Why it might not work: Soil carbon in pastures accumulates slowly. Grass-fed cattle can emit more methane than feedlot cows, offsetting climate benefits.

Reduced fossil fuel use
What it means: Reducing tractor driving and machinery use.
How it works: Eliminating tillage can also reduce tractor trips, lowering emissions.
Why it might not work: Fossil fuel use is a small fraction of farms’ overall emissions. Some tractor driving is still necessary, and cover cropping can require additional passes.

A meta-analysis of data from experimental 
plots published in May offered encourage- 
ment. It found that no-till and cover crop- 
ping each increased topsoil carbon by an 
average of more than 11%, although the prac- 
tices had to be applied for at least 6 years to 
generate significant gains. 

Other recent findings have tempered some 
of the enthusiasm. Many studies show that 
when tillage is reduced, carbon accumulates 
in the topmost soil layers. But scientists who 
dug deeper often found offsetting losses, in 
part because crop residues that tilling would 
have driven into deeper soils instead decom- 
pose on the surface and release carbon into 
the atmosphere. And farmers usually till 
every few years anyway, to counter weeds 
and break up compacted soil, releasing 
much of the stored carbon in upper soil 
layers. When it comes to carbon, “I’ve 
never seen too much of a benefit in no- 
till alone,’ says Jon Sanderman, a soil 
scientist at the Woodwell Climate Re- 
search Center. 

Many researchers have higher 
hopes for off-season cover crops such 
as rye or radishes, whose carbon-rich 
roots sequester carbon in the soil. 
But cover crops also have drawbacks. 
They can delay or complicate plant- 
ing of the cash crop, so farmers often 
kill them early, sacrificing some of 
their benefits. And in colder regions, 
including much of the U.S. Corn Belt, 
the window between the fall harvest 
and winter is often too short and cold
for cover crops to germinate and grow.

A Maryland farmer plants a cover crop after a corn harvest, a practice that can pull carbon from the air and store it in soils.

But perhaps the biggest impediment to 
the widespread adoption of climate-friendly 
farming is the lack of a practical way to 
quantify the soil carbon gained through a 
regenerative tactic. Even harder to measure 
are emissions of nitrous oxide (N₂O), a potent
greenhouse gas released by soil microbes di- 
gesting nitrogen fertilizer that accounts for 
about 6% of total climate warming. Precise 
measurements would require expensive in- 
struments and soil coring campaigns. “We 
don’t have a soil carbon thermometer to 
stick in the ground,” says Keith Paustian, a 
CSU soil scientist and consultant for Indigo. 
“A farmer can’t just check his N₂O meter
once a day.” 

Cylindrical cores pulled from fields are the gold standard
for measuring carbon in soil.

These issues have bedeviled companies
trying to commercialize soil carbon. In 2019,
Seattle-based startup Nori announced it
had sold the first-ever soil carbon credits,
generated by a grower in Maryland. But its
methods faced criticism. Not only did Nori 
collect no soil samples, it also did not have 
its credits validated by a registry—a third- 
party entity intended to add transparency 
and rigor to carbon markets. Scientists at 
Indigo thought they could do better. 

The privately held company, which has 
raised more than $1 billion, had a different 
agenda when it launched in 2013. At that 
time, its focus was beneficial microbes that, 
when applied to crop seeds, were supposed 
to help plants grow faster and more resil- 
iently. The company later set up a commodi- 
ties marketplace to try to help farmers earn 
premiums for sustainably grown grain—a 
venture that didn’t go as planned. In 2019, 
the company announced it was getting into 
the soil carbon credit business. 

Unlike Nori, Indigo chose to work with a 
third-party registry called the Climate Action 
Reserve, known for its work in Califor- 
nia’s regulatory carbon market. And it 
would require 30-centimeter-deep soil 
cores, taken every 5 years, from 10% 
to 30% of participating farm fields— 
enough, the company’s scientists cal- 
culated, to assess the total sequestered 
carbon with reasonable accuracy. The 
company would also rely heavily on 
an academic computer model, developed
at CSU and funded by the U.S.
Department of Agriculture (USDA), to 
estimate the climate benefits of farm- 
ing practices. 


THE MODEL, launched in the 1980s, was 
originally called Century because it sim- 
ulated soil carbon dynamics on time- 
scales of a century or longer. As concerns 
about climate change grew, the CSU 
team looked to expand the model to capture 
how the three major greenhouse gases— 
CO₂, methane, and N₂O—pass between air 
and land during a growing season. But 
Century modeled changes in monthly time 
steps, far too coarse to capture real-world 
greenhouse gas fluxes. 

By the late 1990s, researchers had up- 
graded to a daily time step, and DayCent 
was born. It has since become one of the 
world’s most prominent soil models. Many 
of the world’s top climate change-forecasting 
models include DayCent code, and the U.S. 
Environmental Protection Agency relies on 
its results for annual emissions reports to 
the United Nations. 

Despite its prominence, DayCent has 
plenty of shortcomings. It doesn’t ex- 
plicitly represent how soils actually 
work, with billions of microbes feast- 
ing on plant carbon and respiring 
much of it back to the atmosphere— 
while converting some of it to miner- 
alized forms that can stick around for 
centuries. Instead, the model estimates 
soil carbon gains and losses based on 
parameters tuned using published ex- 
perimental results. 

Another challenge is accounting for 
N₂O, which soil microbes can belch 
out suddenly in big pulses. Without 
explicitly representing microbial activ- 
ity, DayCent has struggled to predict 
when soils will vent the gas, and how 
much. “N₂O is even more uncertain 
than [soil organic carbon],” says Ram 
Gurung, a CSU statistician who works 
on the model. 

A third weakness: To tune and 
ground-truth DayCent, researchers 
rely on data from a modest number 
of university and government field 
trials that do not always mimic real 
farm conditions accurately. It’s far too 
puny a data set to represent the vast, 
varied landscapes and farming systems in 
the United States, much less the world, says 
Stephen Ogle, a CSU soil scientist and one 
of DayCent’s primary developers. “I would 
say we're data hungry.” 

The limitations saddle the model output 
with uncertainties that can be especially 
large for small areas. A 2010 study led by 
Ogle found these uncertainties could exceed 
100% for a particular farm or even a state- 
size region, meaning the model could not 
say whether soil carbon had accumulated or 
decreased over time. 
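
To see why an uncertainty above 100% leaves the direction of change undetermined, consider a minimal sketch. It assumes, as a simplification not taken from the 2010 study, that the uncertainty is a symmetric percentage of the central estimate. The function names and the numbers are hypothetical.

```python
def change_interval(estimate, relative_uncertainty):
    """Return (low, high) bounds for an estimated soil carbon change.

    Assumes a symmetric uncertainty expressed as a fraction of the
    estimate (1.0 means 100%). This is an illustrative simplification.
    """
    spread = abs(estimate) * relative_uncertainty
    return estimate - spread, estimate + spread


def direction_is_resolved(estimate, relative_uncertainty):
    """True only if the whole interval lies on one side of zero."""
    low, high = change_interval(estimate, relative_uncertainty)
    return low > 0 or high < 0


# Hypothetical estimate: a gain of 0.5 tons of carbon per hectare per year.
for rel in (0.4, 1.2):
    low, high = change_interval(0.5, rel)
    verdict = "resolved" if direction_is_resolved(0.5, rel) else "unresolved"
    print(f"uncertainty {rel:.0%}: {low:+.2f} to {high:+.2f} ({verdict})")
```

Once the relative uncertainty passes 100%, the lower bound crosses zero, so a gain and a loss are both consistent with the same model output.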

Nevertheless, some CSU researchers, 
along with their USDA sponsors, wanted to 
make DayCent modeling more publicly ac- 
cessible. In the early 2010s, they released 
COMET-Farm, a web tool based largely on 
DayCent. Farmers could input information 
about their fields and proposed practice 
changes, such as reducing tillage or intro- 
ducing cover crops, and obtain an estimate 
of the carbon they would sequester. 
Companies began to show interest in us- 
ing DayCent and COMET-Farm for carbon 
markets. Marx says concerns about the gi- 
ant uncertainties were increasingly brushed 
under the rug, giving way to what he calls 
a “gold rush mentality.” In 2019, Paustian, 
leader of the CSU modeling team, founded 
a company called Soil Metrics to provide 
commercial access to DayCent. Both Nori 
and Indigo became clients, and in 2021, 
Indigo acquired Soil Metrics outright, with 
Paustian becoming a consultant. “I never 
wanted to work in business,” Paustian says, 
but his research group couldn’t keep up 
with the requests they were getting for help. 

Young corn pushes through previous years’ stubble. Planting 
without tilling is thought to improve soil health and store carbon. 


AROUND THIS TIME, Paustian and other CSU 
researchers launched a new project to bet- 
ter quantify the model’s uncertainties. Marx 
says the research team found that the uncer- 
tainties were still too large to tell whether 
a particular change in farming practice 
was having a positive or negative impact 
on the climate. The results were never pub- 
lished, and the public-facing COMET-Farm 
interface continues to state that meth- 
ods to estimate uncertainty are “currently 
under development.” 

Adam Chambers, a USDA program officer 
who funds and oversees the development of 
DayCent and COMET-Farm, says the uncertainty 
analysis proved harder than anticipated. 
“We’ve kind of bumped up against the 
limits of science,” he says. “We’re stumped.” 

Chambers also shared with Science a Word 
document marked “Confidential Draft” de- 
scribing part of the unpublished CSU-led 
project. It shows that DayCent’s results are 
not only uncertain, but also biased in ways 
that exaggerate carbon storage estimates for 
soils containing above a certain amount of 
carbon. Chambers says it could take years to 
understand and correct the bias. 

In the meantime, Ogle says, the bias 
means the climate benefits of regenerative 
practices, though likely greater than zero, 
“may not be as large as we're estimating.” He 
acknowledges that the team has fallen short 
in its reporting of uncertainties to farmers, 
policymakers, and other stakeholders. 
“We need to do better.” 

Marx eventually concluded that the 
CSU team was deliberately obscuring 
the model’s shortcomings. “It doesn’t 
take that long to calculate uncertainty,” 
he says. In early 2021, he filed a com- 
plaint alleging that the uncertainty re- 
sults had been suppressed, violating 
the university’s and USDA’s research 
integrity policies. He and Paustian 
were interviewed by an investigation 
committee. In his testimony, Paustian 
called Marx’s accusations “utterly false,” 
according to a copy of the committee’s 
report obtained by Science. He main- 
tained that the project proved more 
challenging than expected and argued 
that because carbon storage projects 
created by companies such as Indigo 
are typically spread across many farms, 
quantifying uncertainties for a single 
field is not that relevant. “The uncer- 
tainty at Farmer Jones’s back 40 acres, 
to me that’s actually not really very 
meaningful,” he says. 

Chambers also defends the team. 
“There’s no other model that has done 
this much transparent disclosure of its 
strengths and weaknesses,” he says. Un- 
certainty estimates will be included in an 
updated version of COMET-Farm set to be 
released later this year, Paustian says. 

In October 2021, the university dismissed 
Marx’s complaint. However, in its report, 
the investigation committee chastised the 
modeling team for claiming in the COMET- 
Farm interface that results were “accurate”— 
a word that has since been removed from 
the website. 


PEOPLE WORKING in soil carbon credit mar- 
kets are aware of the limitations of DayCent 
and COMET-Farm. Radhika Moolgavkar, 
head of supply and methodology at Nori, 
which relies heavily on DayCent, calls the 
lack of uncertainty estimates “concerning.” 
Cristine Morgan, chief scientific officer at 
the Soil Health Institute, which is develop- 
ing methods to sample and measure car- 
bon in farm fields, says she has not seen a 
model good enough to support a soil carbon 
offset program. “In a transactional world, 
you want certainty, and the models are cur- 
rently very uncertain.” 

Model uncertainty isn’t the only problem 
researchers are finding with the soil carbon 
business. Indigo promises its credited carbon 
will stay locked in the soil for 100 years, off- 
setting fossil fuel emissions that will remain 
in the atmosphere for centuries. That means 
the company assumes farmers will maintain 
regenerative practices for that duration, long 
after annual payments end. Jane Zelikova, 
director of the Soil Carbon Solutions Center 
at CSU, is not persuaded. “The 100-year per- 
manence isn’t real.” 

Indigo says it is doing its best to reduce 
uncertainties in the model. It is attacking 
the problem using a new “Bayesian” method, 
whereby researchers identify the most im- 
portant model parameters and tune them 
based on comparisons of the model’s results 
and experimental data. “What Indigo is do- 
ing is far more sophisticated than what the 
USDA's own product is doing right now,” says 
Michael Dietze, a Boston University eco- 
logist who has reviewed Indigo’s protocols. 
Although the overall approach is “not perfect,” 
Dietze says, Indigo’s strategy for merging 
measurements and modeling to provide 
reasonably accurate estimates “makes a lot 
of sense.” 
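
The general idea of tuning a parameter against measurements can be sketched in a few lines. The example below is not Indigo's protocol and not DayCent. It assumes a hypothetical one-pool soil carbon model, a uniform prior over a grid of decay rates, Gaussian measurement error, and synthetic observations generated from a known decay rate.

```python
import math


def toy_model(decay_rate, years, initial_carbon=50.0, input_rate=1.0):
    """Hypothetical one-pool model: dC/dt = input_rate - decay_rate * C."""
    steady = input_rate / decay_rate
    return steady + (initial_carbon - steady) * math.exp(-decay_rate * years)


def grid_posterior(observations, candidates, sigma=1.0):
    """Posterior over candidate decay rates (uniform prior, Gaussian errors)."""
    log_like = [
        sum(-0.5 * ((obs - toy_model(k, t)) / sigma) ** 2 for t, obs in observations)
        for k in candidates
    ]
    peak = max(log_like)  # subtract the peak for numerical stability
    weights = [math.exp(ll - peak) for ll in log_like]
    total = sum(weights)
    return [w / total for w in weights]


# Candidate decay rates from 0.010 to 0.100 per year.
candidates = [round(0.01 + 0.001 * i, 3) for i in range(91)]

# Synthetic, noise-free "measurements" at years 5, 10, and 20.
true_rate = 0.03
observations = [(t, toy_model(true_rate, t)) for t in (5, 10, 20)]

posterior = grid_posterior(observations, candidates)
best = candidates[posterior.index(max(posterior))]
print(f"most probable decay rate: {best}")
```

The width of the resulting posterior, not just its peak, is what carries the uncertainty forward into any carbon estimate built on the tuned parameter.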

The company also accounts for uncertain- 
ties by putting 14.5% of the credits farmers 
create into a “buffer pool” rather than selling 
them, in case natural disasters wipe out 
carbon gains, says A. J. Kumar, Indigo’s vice 
president of sustainability. “If we’ve done 
everything right,” Kumar says, “the impact 
we've had is much more than we're actually 
creating credits for.” 
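
The buffer pool arithmetic is simple to illustrate. In the sketch below, only the 14.5% share comes from the article; the function name and the figure of 1000 credits are hypothetical.

```python
def split_credits(credits_generated, buffer_fraction=0.145):
    """Split generated credits into a sellable share and a withheld buffer."""
    if not 0.0 <= buffer_fraction <= 1.0:
        raise ValueError("buffer_fraction must be between 0 and 1")
    buffered = credits_generated * buffer_fraction
    return credits_generated - buffered, buffered


# Hypothetical project that generates 1000 credits.
sellable, buffered = split_credits(1000)
print(f"sellable: {sellable:.1f}, buffer pool: {buffered:.1f}")
```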

And the company is also doing the re- 
search scientists say is needed to put such 
schemes on a firmer footing. A couple of 
hours north of Unger’s place on a farm 
near Arcadia, Indiana, a cover-cropped field 
sat alongside a bare one in April. Shortly 
thereafter, an Indigo-funded crew would de- 
scend on the site to pull out a slew of both 
30-centimeter and meter-long soil cores that 
the company hopes will reveal how much 
faster carbon is building up in the cover- 
cropped field. Meanwhile, a solar-powered 
flux tower measures CO₂ wafting into and 
out of soil every 100 milliseconds. It’s all part 
of an ambitious soil carbon experiment that 
Indigo launched in 2019, alongside its offset 
program. And it will soon have a new ally: 
In July, USDA announced it would invest 
$300 million in a network of soil-monitoring 
sites across U.S. farmland—something re- 
searchers such as Ogle and Paustian have 
long pleaded for. 

Indigo’s leaders say these studies will add 
confidence, but waiting years for results 
to come in before launching their mar- 
ketplace would have meant a disastrous 
delay. “Are we going to wait for the planet 
to catch on fire?” asks Chris Harbourt, In- 
digo’s chief strategy officer. “We need solu- 
tions right now to reverse climate change.” 
In December 2022, Indigo said it had paid 
more than 400 U.S. farmers $3.7 million 
for regenerative practices implemented 
on more than 170,000 hectares. Next year, 
it hopes to ramp up to 2.2 million hectares 
under contract. The chemical and 
seed giant Corteva has enrolled more than 
400,000 hectares farmed by its customers 
in Indigo’s program. 


IN ORDER TO ultimately succeed, Indigo and 
other soil carbon capture programs will need 
more than the blessing of researchers. They 
will also need thousands of farmers willing 
to rethink practices that in many cases go 
back generations. Early adopters like Unger 
provide notes of both optimism and caution. 

Although submitting years of farm data to 
Indigo was a pain, Unger says he has been 
happy working with the company so far. 
The payments have helped him stop tilling 
and plant cover crops on some of his lower 
quality fields—changes he wanted to make 
anyway. But Indigo doesn’t pay enough to 
induce him to mess with his best fields. “If 
it aligns with what I’m trying to already do 
and they want to pay me for it, more power 
to them,” Unger says. “But I’m not going to 
risk my future and my kids’ future.” 

And like some researchers, he cautions 
against hyping regenerative agriculture too 
much. Ideas that work well on paper can be 
upended by the vagaries of weather, mar- 
kets, and other unpredictable factors that 
farmers face. 

“Telling people what to do on a farm when 
you sit in an office 1000 miles away from 
them is pretty easy. But if you’re the one out 
here making decisions every day ... that’s not 
the world we live in,” Unger says. 

“To say that every year a guy plants cover 
crops, that it’s going to be profitable for you, 
it’s going to work out—that’s a pipe dream 
right there.” 

A shared foundation of language change 


Short-term development and long-term evolution of language share mechanisms 


By Simon J. Greenhill 


As the world changes, humans encounter 
new things that need to be 
described using a finite set of words. 
A common strategy for labeling these 
novelties is to reuse existing words— 
i.e., word meaning extension. For 
example, “mouse” can refer to a computer 
control device. Children also creatively 
overextend word meanings as they learn 
their languages. The need to name novel- 
ties has been present during the evolution 
of language, often resulting in the use of 
one word to express two different mean- 
ings. For example, Russian labels (colexi- 
fies) both “tree” and “wood” with “derevo” 
(1); this is a common pattern worldwide (2). 
On page 431 of this issue, Brochhagen et al. 
(3) present evidence that the overextension 
process that gives rise to new terms in the 
short term is the same process that gener- 
ates patterns between languages over the 
long term. Their findings have major im- 
plications for the study of language change 
over evolutionary history. 


Brochhagen et al. used three different da- 
tasets: a database of common errors made 
by children when learning to communicate, 
a database of documented historical seman- 
tic shifts (for example, the meaning of “viral” 
recently shifted from “caused by a virus” to 
“spreading widely”), and a global database 
containing information about which words 
colexify in different languages. These datasets 
provide information on language change 
at three different scales: within children, 
within languages, and between languages. 

As they learn to speak, children often use words 
that they have already learned to name new objects, 
in a process called word meaning extension. 

Brochhagen et al. investigated how pairs 
of words found in these datasets are pre- 
dicted by four types of knowledge: associa- 
tivity (whether the words are semantically 
similar), visual similarity (e.g., a computer 
mouse is more similar to a rodent than an ice 
cream), taxonomic similarity (e.g., a mouse is 
closer to a cat than an ice cream), and affec- 
tive similarity (e.g., ice creams are associated 
with happiness, mice less so). In all three 
datasets, the biggest driver of word pairing 
was associativity, followed by taxonomic and 
then visual similarity. 

The authors found that taking a model 
trained on one dataset and applying it to an- 
other dataset explained word pairings in the 
second dataset almost as well as in the first. 
This suggests that there is a shared founda- 
tion underlying word meanings. Brochhagen 
et al. argue that this commonality is not 
an outcome of childhood errors becoming 
adulthood norms. Instead, they argue that 
there is an underlying common foundation— 
linguistic creativity. That is, both children 
and adults use their rich knowledge of the 
world and the objects in it to label new enti- 
ties on the basis of their similarity to things 
they already know. It is this creativity that 
could cause the patterns of word meaning 
extension during childhood development 
(ontogeny) to recapitulate those in language 
evolution (phylogeny). 

The analysis of Brochhagen et al. adds to 
recent studies showing that small-scale pro- 
cesses can have a substantial effect on lan- 
guage at a larger scale. For example, words 
that are more common (4) or have grammati- 
cal features that are more abstract (5) in daily 
speech tend to be those that evolve more 
slowly in the long term. Similarly, evidence 
indicates that low-level cognitive biases, such 
as a preference for interpreting noun phrases 
as the agent of a sentence, may have shaped 
the global patterns of language diversity (6). 

Taken together, these findings have impor- 
tant implications for investigating language. 
The influences of linguistic creativity, usage, 
and cognition on language change are oper- 
ating at a small scale—for example, within 
the brain or within a community with a 
common language—but accumulate to gen- 
erate large-scale global patterns of linguistic 
diversity between languages and over time. 
Therefore, a theory of language change and 
evolution is needed that links the processes 
operating within individuals over millisec- 
onds to those operating as children learn 
language and to those operating within and 
between communities over centuries. 

This task will not be easy. Complex adap- 
tive systems such as language require com- 
plex adaptive explanations operating at dif- 
ferent scales (7). There are some promising 
signs that linguistics is heading toward this 
more-comprehensive framework. For exam- 
ple, there are an increasing number of theo- 
retical attempts to connect processes across 
timescales (8). Researchers are interrogating 
whether language evolution conforms to 
predictions from general evolutionary the- 
ory or whether new theoretical constructs 
are required (9). Others have suggested experiments 
to test these predictions (10). In 
addition, there are an increasing number of 
large databases of primary language data 
from across the globe that enable researchers 
to ask questions about language change 
across multiple timescales (1, 11). Linguistics 
as a whole is also undergoing a shift toward 
the use of more-robust quantitative methods 
(12), which will enable the application of 
powerful analytical tools to these data. 

Combining these tools, data, and ideas 
will connect the processes causing change 
at different timescales and enable the iden- 
tification of key causal pathways that have 
shaped humanity’s linguistic diversity over 
time and across the globe. There are many 
exciting topics to explore in this space. For 
example, how do cognitive biases affect 
learning of the different languages found 
around the world? It will also be interesting 
to ask how learning interacts with language 
systems that are configured in different ways 
and how repeated pathways like grammati- 
calization (by which words representing ob- 
jects and actions become grammatical mark- 
ers) are affected by language acquisition and 
evolution. Quantifying sociolinguistic aware- 
ness of linguistic systems across languages 
and evaluating how these interact with the 
formation of social groups would also be an 
interesting area for future research. 


REFERENCES AND NOTES 


1. C. Rzymski et al., Sci. Data 7, 13 (2020). 

2. A. Schapper, L. S. Roque, R. Hendery, in The Lexical 
Typology of Semantic Shifts, P. Juvonen, M. Koptjevskaja- 
Tamm, Eds. (De Gruyter Mouton, 2016), pp. 355-422. 

3. T. Brochhagen, G. Boleda, E. Gualdoni, Y. Xu, Science 381, 
431 (2023). 

4. A. S. Calude, M. Pagel, Phil. Trans. R. Soc. B 366, 1101 
(2011). 

5. S. J. Greenhill et al., Proc. Natl. Acad. Sci. U.S.A. 114, 
E8822 (2017). 

6. B. Bickel, A. Witzlack-Makarevich, K. K. Choudhary, M. 
Schlesewsky, I. Bornkessel-Schlesewsky, PLOS ONE 10, 
e0132819 (2015). 

7. P. W. Anderson, Science 177, 393 (1972). 

8. W. Labov, Language 83, 344 (2007). 

9. D. Dediu et al., in Cultural Evolution: Society, Technology, 
Language, and Religion, P. J. Richerson, M. H. 
Christiansen, Eds. (MIT Press, 2013), pp. 303-332. 

10. G. Roberts, B. Sneller, Lang. Dyn. Change 10, 188 (2020). 

11. H. Skirgård et al., Sci. Adv. 9, eadg6175 (2023). 

12. B. Kortmann, Linguistics 59, 1207 (2021). 


10.1126/science.adj2154 




FRACTURE MECHANICS 


Cracks break 
the sound 
barrier 


Experiments show that 
tensile cracks can travel 
above the speed of sound 


By Michael Marder 


Cracks at scales too small to see permeate 
most solid objects, and they are dangerous 
when they grow and rip things apart. Thus, 
the study of crack dynamics is an important 
part of fracture mechanics—the discipline 
that explains the stress that cracked 
materials can sustain before they give way. 
This understanding is essential for appli- 
cations ranging from airplane safety to 
earthquake detection and prediction. For 
many decades, there has been a consensus 
on the speed limit to crack propagation in 
a body pulled apart in tension. The limit 
is the speed at which sound travels across 
a free surface, called the Rayleigh wave 
speed. On page 415 of this issue, Wang et 
al. (1) report that the Rayleigh wave speed 
is not the limit after all; cracks can travel 
at the speed of sound and beyond. 

Cracks have long been easiest to under- 
stand when studied through a combina- 
tion of experiments in model materials and 
mathematical analysis. An early study of 
this type, in 1921, involved cracks in glass 
(2). It showed that the motion of a crack in- 
volves the interplay of two factors. When a 
crack extends, it relieves stress and recov- 
ers stored elastic potential energy. However, 
energy must be spent to pull atoms apart 
and rupture the material. A solid under 
stress is said to reach the Griffith point 
when these two factors exactly balance; 
if more stress is applied, the extra energy 
induces the crack movement. But how fast 
can the crack travel? A precise calculation 
of crack dynamics was achieved 30 years 
later, in 1951 (3), through an exact solution 
for a moving crack described as a sum of 
surface waves. It stands to reason that the 
speed of crack propagation is limited by the 
fastest surface wave. The solutions to the 
crack dynamics equation become singular; 
that is, they take infinite values as the crack 
approaches this speed, which corresponds 
to the Rayleigh wave speed (4). 
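
The energy balance at the Griffith point can be made concrete with the textbook result for an idealized case: a through crack of half-length a in a large thin plate under remote tension, for which the critical stress is sqrt(2Eγ/(πa)), with E the Young's modulus and γ the surface energy. The sketch below assumes that geometry. The material values are rough, illustrative figures of the order quoted for glass, not data from the studies cited here.

```python
import math


def griffith_critical_stress(youngs_modulus_pa, surface_energy_j_m2, half_length_m):
    """Stress at which released elastic energy balances the cost of new surface.

    Assumes Griffith's idealized geometry: a through crack of half-length
    `half_length_m` in a large thin plate (plane stress) under remote tension.
    """
    return math.sqrt(
        2.0 * youngs_modulus_pa * surface_energy_j_m2 / (math.pi * half_length_m)
    )


# Illustrative, order-of-magnitude inputs (assumptions, not measured data).
E = 70e9      # Young's modulus in pascals
gamma = 1.0   # surface energy in joules per square meter

for a in (1e-6, 1e-4):  # crack half-lengths in meters
    stress = griffith_critical_stress(E, gamma, a)
    print(f"half-length {a:.0e} m: critical stress {stress / 1e6:.1f} MPa")
```

Because the critical stress scales as one over the square root of crack length, a crack 100 times longer gives way at one-tenth the stress.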

According to current understanding (5, 
6), the limit of crack speed propagation 
is explained as a consequence of energy 
transport. The linear elastic theory of dy- 
namic fracture states that one can draw 
a loop around the tip of a moving crack 
and compute the energy passing through 
the loop. When the crack tip reaches the 
Rayleigh wave speed, the energy expres- 
sion approaches infinity; past the Rayleigh 
wave speed it becomes negative, and then 
at slightly higher speeds it becomes imagi- 
nary. Negative energy from a crack would 
make perpetual motion possible, and imagi- 
nary energy makes no sense; these are both 
violations of the laws of physics, so such 
cracks were assumed to be impossible. 

An exception to this assumption has 
been known for some time. It was dem- 
onstrated in 1976 (7) that when cracks 
are driven in shear (the forces driving the 
crack are parallel to the crack), there is a 
special velocity above the Rayleigh wave 
speed at which energy expressions become 
finite again. Some researchers (8) found 
supersonic cracks of this type in the lab, 
and others obtained them in simulations 
(9). Earthquakes can be cracks of this type 
too, which explains field observations of 
supersonic earthquakes. 

Thus, it was puzzling when cracks faster 
than the Rayleigh wave speed were ob- 
served in experiments carried out on rubber 
under tension (10). This was the scenario 
that the case of imaginary energies 
was supposed to forbid. One explanation 
put forward to resolve the difficulty was 
that near the tip of the crack, the speed of 
sound increases (11). Another possible explanation 
came from the dynamic theory 
for cracks in crystalline lattices, which 
found that once the discrete atomic nature 
of solids is treated explicitly in fracture 
theory, cracks can become supersonic with- 
out needing any increase in wave speed 
(12-14). But these findings failed to create 
a consensus that supersonic cracks under 
tension exist. Perhaps there was something 
peculiar about rubber, or the elastic theory 
that describes rubber, or lattice models. 
This is where community consensus rested 
for many years. 

Now, Wang et al. have conducted laboratory 
experiments in a model brittle material, 
a polymer gel, where sound speeds 
are low and cracks are easy to follow. They 
carefully studied subsonic cracks in their 
samples and showed that the cracks obey 
in all detail predictions of the linear elastic 
theory of dynamic fracture. Then they pulled 
harder and harder on the material and the 
cracks accelerated, reaching and surpassing 
the Rayleigh wave speed. 

Supersonic crack in a lattice 

In the linear elastic theory of dynamic fracture, cracks 
have rounded tips and move because energy flows 
into their tips. Supersonic cracks, driven by pulling 
hard on materials weakened along a plane, look 
different, with a wedge-like crack tip and Mach cones. 

A polymer gel is far from a regular 
crystalline lattice. Nevertheless, the ex- 
periments of Wang et al. act in many re- 
spects like supersonic cracks in lattices. 
Both systems display wedge-shaped tips 
surrounded by Mach cones, which refer to 
the shock waves that form around all su- 
personic objects, including aircraft, where 
the shocks create sonic booms (see the 
figure). Both in theory and in the experiments
of Wang et al., the speed of crack propagation
depends on how much material in front of 
the crack has been stretched, rather than 
on how much energy is stored ahead of 
the crack as in the linear elastic theory of 
dynamic fracture. The polymer gel experiment
shows no signs of rising wave speeds
near the tip as proposed previously (11).

Thus, it appears there is a new domain of
crack motion, conventionally thought until
now not to exist, where cracks under tension
travel faster than the speed of sound.
A necessary condition for such cracks to 
exist is that the tip must remain stable at 
high speeds—that is, the tip must keep from 
splitting, swerving, branching, or blunting. 
The new experiments by Wang et al. stabi- 
lize crack tips by weakening the plane along 
which the cracks travel. However, many 
questions about supersonic cracks are not 
resolved. It is not certain whether they can
exist in all materials, or just special ones,
and which material properties would need
to be present. These are just some of the
problems to solve next.

Trimming away tau in neurodegeneration


Tau quality control by tripartite motif 11 (TRIM11) protects neurons in mice 


By Wendy Noble and Diane P. Hanger


Biological quality control systems eliminate
potentially harmful misfolded proteins and
prevent them from disrupting cell homeostasis.
Defective protein quality control in
neurodegenerative diseases, including tauopathies
such as Alzheimer’s disease (AD), leads
to the accumulation of misfolded proteins
that drive pathology (1). However, the importance
of these systems to the pathogenesis of
disease and therefore their utility as therapeutic
targets are not known. On page
413 of this issue, Zhang et al. (2) identify
tripartite motif 11 (TRIM11) as a key
regulator of tau aggregate accumulation
in neurodegenerative disease. Their
work highlights TRIM11 as a potential
therapeutic target for ameliorating
tau-associated neurodegeneration.

TRIMs constitute a superfamily of 
E3 ubiquitin ligases that act through 
several different pathways to regulate 
the turnover of functional and mis- 
folded proteins. Multiple TRIMs have 
been identified as modifiers of pro- 
tein aggregation in neurodegenerative 
diseases (3). TRIMs are structurally 
defined by the presence of several do- 
mains, including coiled-coil domains 
in the N terminus that allow TRIM 
proteins to self-associate and define 
the specific interactions and functions 
of individual TRIM family members 
(3). Alongside variation in other func- 
tional domains and motifs, this struc- 
tural heterogeneity allows specific 
TRIMs to play quite different roles in 
health and disease. 

In a healthy brain, tau is primarily a 
soluble protein found in the cytoplasm. 
It is vital to fundamental functions in- 
cluding axonal transport, cell signaling, and 
cytoskeletal support. In AD, progressive su- 
pranuclear palsy (PSP), and specific forms 
of frontotemporal dementia (FTD) caused 
by mutations in the gene encoding tau, tau 
misfolds, forms oligomers, and becomes pro- 
gressively insoluble, resulting in its aggrega- 
tion into filaments with well-defined struc- 
tures (4). The accumulation and propagation 
of misfolded tau in the brain has detrimental
effects, including the disruption of synapses 
and neural networks, sustained neuroin- 
flammation, and altered neuron-glial in- 
teractions, alongside impaired proteostasis. 
The loss of functional soluble tau also adds 
to the damage by disrupting axonal transport
and cell signaling (1).

Tripartite motif 11 disaggregates
and degrades misfolded tau

In Alzheimer’s disease and other tauopathies, tau protein
misfolds and forms oligomers, which clump together to form
filamentous aggregates. Tripartite motif 11 (TRIM11)
breaks up these aggregates and also facilitates the proteasomal
degradation of misfolded tau.

Several factors are believed to contribute
to the development of misfolded tau aggregates,
including aberrant tau phosphorylation,
tau cleavage, disrupted cellular transport,
impaired protein clearance, and rare
mutations in FTD that affect tau splicing and
aggregation. Targeting these mechanisms to 
ameliorate tau-associated neurodegenera- 
tion has shown promise in preclinical mod- 
els. Tau disaggregating agents are posited by 
some as a favored therapeutic approach and 
are currently in phase 3 trials for AD (5). 
Zhang et al. showed that TRIM11 levels 
are reduced in brains from individuals with 
AD. This loss of TRIM11 in human AD brain 
may result from interrupted transcription
of TRIM11 in the presence of intronic
TRIM11 variants such as rs564309, which
is associated with higher burden of tau
pathology and disease progression in PSP (6).
In non-neuronal human embryonic kidney 
293T (HEK293T) cells that were grown in 
vitro and expressed FTD-causing forms of 
tau, Zhang et al. found that TRIM11 was 
a potent tau disaggregase, increasing the 
available pool of soluble, functional tau. 
In mouse neurons harboring aggregates 
of FTD-associated mutant tau, TRIM11
preserved neuronal connectivity. Notably, 
TRIM11 also promoted the proteasomal 
degradation of misfolded and excess soluble
tau in the mutant tau-expressing
HEK293T cells, maintaining a healthy 
equilibrium of functional tau and re- 
ducing tau aggregate seeding (see the 
figure). This is an important property 
of TRIM11 because the disaggregation 
of filamentous tau might release tau 
oligomers, which are toxic. 

Enhanced expression of TRIM11
protects from dopaminergic neuron
loss in models of Parkinson’s disease
by promoting removal of aggregated
α-synuclein, the major protein constituent
of pathological Lewy bodies (7).
In models of Huntington’s disease and
spinocerebellar ataxia type 1, TRIM11
enhances the degradation of mutant
forms of huntingtin and ataxin-1,
which form aggregates in the disease 
state (8, 9). These findings reflect the 
more widespread potential of TRIM11 
as a neuroprotective agent. However, 
because tau is not a specific substrate 
of TRIM11, it is important to consider 
that other interactions of TRIM11 may
counteract the protection conferred
by the removal of tau aggregates. For
example, TRIM11 also binds to and
degrades humanin (10), a mitochondrially
encoded polypeptide that is
believed to be neuroprotective in AD (11). 
More widely, TRIM11 is involved in the re- 
moval of misfolded proteins in several can- 
cers. Increased TRIM11 expression might 
partly explain the increased protein degra- 
dative capacity of tumor cells, including hu- 
man breast cancer cells (9). By accelerating 
the proteasomal clearance of p53 and other 
tumor suppressors, TRIM11 promotes tu- 
mor progression, and high levels of TRIM11 
expression correlate with reduced survival 
in human colon cancer (12). TRIM11 is also 
associated with increased resistance to cancer
therapeutics such as cisplatin and paclitaxel
(13). These potentially detrimental
effects of TRIM11 mean that careful moni- 
toring of localized TRIM11 interactions is 
likely to be needed before TRIM11 can be 
considered a viable target for therapy in 
neurodegenerative diseases. 

How could TRIM11 expression be increased
therapeutically? It may be possible
to regulate positive downstream or nega- 
tive upstream regulators of TRIM11 phar- 
macologically. For example, nuclear factor 
erythroid 2-related factor 2 (NRF2) is a key 
transcriptional regulator of TRIMs, includ- 
ing TRIM11. Reduction of NRF2 expression 
decreases the transcription and protein lev- 
els of TRIM11 (9) and impairs degradation of 
various protein aggregates in cell models (9). 
However, there are considerable challenges 
in developing safe and effective NRF2 activa- 
tors (14), because sustained NRF2 activation 
is linked to hepatotoxicity in humans and tumor
progression in mice (14). Therefore, the
most obvious option is to increase TRIM11 
protein expression genetically. Zhang et al.
used intracranial delivery of adeno-associated
viruses to increase TRIM11 protein
levels in several different mouse models of 
tauopathy. This reduced the burden of tau 
pathology and neuroinflammation and, im- 
portantly, restored cognition. As with all 
gene therapies, there needs to be careful ti- 
tration of expression levels because overex- 
pression of genes is commonly carcinogenic 
and has the potential to induce a damaging 
immune response. 

A step toward stem cell
engineering in vivo 


mRNA-based delivery may change the paradigm 
of hematopoietic stem cell gene therapy 


By Samuele Ferrari and Luigi Naldini


Hematopoietic stem cell (HSC) gene
therapy provides lifelong and substantial
benefits for several life-threatening
inherited diseases, such as primary
immunodeficiencies, storage disorders,
and hemoglobinopathies (1). Currently, HSC gene therapy
requires harvesting large numbers of a pa- 
tient’s hematopoietic stem and progenitor 
cells (HSPCs), which undergo gene transfer 
or editing ex vivo. Before infusion, the cell 
product is qualified to ensure that it meets 
rigorous safety and efficacy standards, and 
the patient undergoes conditioning che- 
motherapy to deplete endogenous HSPCs 
and make space for the engineered cells 
to engraft in the bone marrow. However, 
the need for laborious manufacturing and 
the toxicity associated with the conditioning
regimen limit the broad application of
these life-saving treatments. On page 436 of
this issue, Breda et al. (2) provide a proof of
principle of in vivo genetic engineering of 
HSPCs in the bone marrow of mice by lever- 
aging transient delivery of mRNA through 
lipid nanoparticles (LNPs) functionally cou- 
pled to antibodies that target HSPCs. 

The LNP-mRNA technology used by 
Breda et al. stems from the advances made 
on mRNA-based vaccines against COVID-19 
(3). Incorporation of modified bases during 
mRNA synthesis, efficient 5’ capping, and 
stringent purification of the desired full- 
length product abrogate detrimental innate 
immune responses to exogenous nucleic ac- 
ids. These modifications also improve sta- 
bility and expression proficiency, turning 
the highly labile nature of natural mRNA 
into a versatile scaffold for transient and ro- 
bust transgene expression without concern 
for stable insertion into the genome. This 
mRNA is then encapsidated within LNPs, 
whose small and homogeneous size, surface 
features that resemble cellular membranes, 
potential for manipulating their composition,
and functional coupling to cell-targeting
moieties allow for in vivo administration
and preferential discharge of their
cargo into some tissues or cells of choice.

San Raffaele Telethon Institute for Gene Therapy, Istituto
di Ricovero e Cura a Carattere Scientifico (IRCCS) San
Raffaele, Milan, Italy. Vita-Salute San Raffaele University,
Milan, Italy. Email: naldini.luigi@hsr.it

Breda et al. generated LNP-mRNA deco- 
rated with antibodies directed against the 
HSPC membrane receptor c-KIT (also called 
CD117). They observed effective delivery of 
the prokaryotic site-specific Cre recombi- 
nase in mouse HSPCs in vivo, providing 
evidence for stable gene editing. Moreover, 
they delivered a pro-apoptotic molecule to 
deplete mouse HSPCs in vivo, as a method 
of nongenotoxic conditioning. Although the 
results reported by Breda et al. are limited 
to mice and they used a surrogate editing 
tool (Cre) that lacks therapeutic use, they 
provide a promising glimpse into a future 
when in vivo engineering by means of tran- 
sient delivery of mRNA-encoding editing 
tools such as CRISPR will circumvent the 
limitations of ex vivo HSC gene therapy (see 
the figure). Moreover, in vivo modulation of 
biological functions in resident HSCs could 
pave the way to new discoveries. Notably, 
there are other approaches under devel- 
opment to genetically modify HSPCs in 
vivo, which involve mobilizing cells from 
the bone marrow and transducing them in 
the circulation with viral vectors such as 
adenoviruses that express the editing tool. 
Because efficiency is low, these approaches 
are combined with selection strategies to 
expand the edited cells (4). 

Future implementation of in vivo HSPC 
engineering by use of LNP-mRNA in hu- 
mans will require further preclinical stud- 
ies to confirm the portability of the find- 
ings of Breda et al. to disease settings, 
nonhuman primates, and humanized bone 
marrow niches. It is conceivable that such 
HSPC engineering approaches will require 
next-generation targeted LNPs that have in- 
creased bioavailability to HSPCs in the bone 
marrow and improved capacity to encapsid- 
ate complex RNA payloads for efficient co- 
expression of editing components. Multiple 
rounds of LNP-mRNA administration might 
compensate for limited efficiency of in vivo 
HSC gene editing, but immune responses 
against the payload may lead to adverse re- 
actions and clearance of transfected cells. 
Controlling LNP-mRNA biodistribution 
and optimizing their dosage remain the
toughest yet most important challenges to
achieve efficient therapeutic gene editing
in vivo without triggering particle- or
cargo-mediated toxicities.

The present and future of hematopoietic stem cell engineering

Ex vivo hematopoietic stem cell (HSC) engineering requires conditioning chemotherapy to deplete resident
hematopoietic stem and progenitor cells (HSPCs) before infusion of modified cells that engraft in the bone
marrow. In vivo HSC engineering with lipid nanoparticle (LNP)-encapsidated mRNA can target HSPCs through
antibody-mediated recognition of surface antigens (e.g., CD117 antibodies recognize c-KIT). This could be
exploited to deliver apoptotic mRNAs to deplete HSPCs and to directly engineer HSPCs when they are in the
bone marrow niche.

Ex vivo HSC engineering
Pros: Highly efficient. Clinically validated. Systematic quality check of the engineered cell product before infusion.
Cons: Expensive. Involves toxic conditioning regimen. Requires complex manufacturing processes. Impact of ex vivo culture on cell phenotype.

In vivo HSC engineering
Pros: Likely less costly. Simple administration and manufacturing. Does not require collection or ex vivo culture. Enables HSC modification in their niche.
Cons: Poor control of the engineered cell product. Further optimization is needed. Potential bystander modification of other cells. Liver toxicity? Immunogenicity?

Lipid-based formulations are largely taken
up by hepatocytes through lipoprotein
receptors and may trigger hepatotoxicity 
at high doses. Moreover, severe toxicities 
emerged in a clinical trial aiming to deplete 
HSPCs with c-KIT antibody conjugated to a 
toxin, possibly related to target expression 
outside the bone marrow (5). This indicates 
that target antigen and administration regi- 
men for functionalized LNPs should be cho- 
sen carefully. Furthermore, Breda et al. report 
no evidence of germline modification, but 
there remain scientific and ethical concerns 
related to its possible occurrence, particu- 
larly when scaling up the administered LNP 
dose or directing them against molecules dis- 
played on the surface of germline cells. 

The occurrence and clinical importance
of heterogeneous and unexpectedly frequent
undesired genetic events are being
investigated in gene-edited cells, including
human HSPCs. According to the choice of
editing tool, these span from genome-wide
off-target activity to large chromosomal
aberrations (6, 7) and activation of cellular
responses that have permanent adverse
impacts on cell fate (8). Prevention of RNA
payload expression through microRNA-mediated
detargeting strategies (9), as also
applied by Breda et al., may constrain the
extent of such events outside of the cells of
interest. Within the target cell population,
however, these adverse outcomes might still
occur and escape detection until aberrant
clones propagate.

Nongenotoxic conditioning regimens are
a long-sought goal of HSC transplantation,
particularly for nonmalignant hematologic
indications (10). The application to HSC
gene therapy may be particularly favorable
because even partial chimerism, with autologous
gene-corrected cells replacing only
a fraction of the HSPCs in the bone marrow,
could provide substantial benefit in many
diseases. Targeted expression of proapop- 
totic factors in HSPCs through LNP-mRNA 
delivery comes as an intriguing alternative to 
other nongenotoxic conditioning approaches 
under investigation, which exploit extensive 
mobilization (11) or HSPC-specific antibodies
conjugated or not with a toxin (12) to egress
or deplete resident HSPCs, respectively. 

A growing concern for all HSC gene 
therapy strategies comes from the pos- 
sible consequences of poor engraftment of 
engineered HSPCs and limited or altered 
clonal composition of the reconstituted hematopoiesis.
Besides the short-term risk of
graft failure, competition for survival and/ 
or growth among cells with different fitness 
and increased replication stress may select 
or amplify HSC clones bearing mutations 
associated with clonal hematopoiesis and 
predisposing to hematological malignan- 
cies (13). These adverse effects may be ag- 
gravated by an inflamed or damaged bone 
marrow niche, disease-specific factors, and 
notably, attempts at in vivo HSC engineer- 
ing that result in poor efficiency and require 
selection of the modified cells. Although the 
clinical implications of such events remain 
to be determined, monitoring of hemato- 
poietic clonality should be implemented in 
clinical trials of any HSC genetic engineer- 
ing strategy for early capture of increased 
emergence of clonal hematopoiesis. 

Given that targeted LNP-mRNA could 
be the next game-changing technology for 
HSC gene therapy, further investigation 
of platform and process improvements 
should be pursued at the preclinical level. 
Meanwhile, patients should not be prema- 
turely exposed to unwarranted risks when 
the current experience of ex vivo HSC gene 
therapy continues to show robust and du- 
rable clinical benefits. 

Roger Searle Payne (1935-2023)


The man who discovered that whales sing 


By Diana Reiss¹ and Stuart Firestein²


Roger Searle Payne, the biologist who
pioneered studies of whale behavior
and communication and advocated
for their protection, died on 10 June.
He was 88. Payne was widely known
to both scientists and the public for
his groundbreaking discovery of the songs of
humpback whales.

Born on 29 January 1935 in New York City, 
Payne received a BA in biology from Harvard 
University in 1956 and a PhD in animal be- 
havior from Cornell University in 1961. 
From 1966 to 1984, he served as a biol- 
ogy and physiology professor at The 
Rockefeller University in New York. Con- 
currently, he worked as a research zoolo- 
gist at the New York Zoological Society 
(NYZS), now known as the Wildlife Con- 
servation Society. He also devoted time 
to the Institute for Research in Animal 
Behavior, a joint endeavor between the 
NYZS and The Rockefeller University. In 
1971, Payne founded Ocean Alliance, an 
organization established to study and 
protect whales and their environment, 
and he remained its director until 2021. 

Initially, Payne’s research focused on 
auditory localization in moths, owls, and 
bats, but he changed course to focus on 
conservation and selected whales for 
their status as a keystone species. In 1967, 
he and his then-wife Katharine (Katy) first 
heard the distinctive sounds of the hump- 
back whales on a secret military recording 
intended to detect Russian submarines off 
the coast of Bermuda. Payne and his collabo- 
rators, including Scott McVay and Frank Wat- 
lington, were the first to discover that male 
humpback whales produce complex and 
varied calls. Mesmerized by the recordings, 
Payne realized that the recurring pattern and 
rhythmicity constituted a song. He published 
his findings in a seminal Science paper in 
1971. After many years and many additional 
recordings, he and Katy further realized that 
the songs varied and changed seasonally. 

These hauntingly beautiful whale songs
captured the public’s attention thanks to
Payne’s extraordinary vision. He released an
album, Songs of the Humpback Whale, in
1970 that included a booklet in English and
Japanese about whale behavior and the dire
situation that many species of whales faced.
He recognized the power of juxtaposing the
plaintive and ethereal songs of humpbacks
with images of whaling. The album became
the acoustic icon for the environmental movement,
birthing the slogan “Save the whales!”
and leading to the 1972 US Marine Mammal
Protection Act, landmark legislation that
brought about the end of large-scale whaling
in the United States and saved several whale
populations from extinction. Continued efforts
led to a global ban on whaling passed
by the International Whaling Commission in
1982; only Japan and Norway refused to sign.
The album remains the most popular nature
recording in history, with more than two million
copies sold. Humpback whale songs are
now carried aboard the Voyager spacecraft
as part of the signature of our planet.

¹Department of Psychology, Hunter College, City University
of New York, New York, NY, USA. ²Department of Biological
Sciences, Columbia University, New York, NY, USA. Email:
dreiss@hunter.cuny.edu; sjf24@columbia.edu

380 28 JULY 2023 • VOL 381 ISSUE 6656

Payne was the first to suggest that 
fin whales and blue whales could commu- 
nicate with sound across entire oceans, a 
theory that was later confirmed. His work led 
to insights into the acoustic communication, 
genetics, and demographics of whale species. 
He authored the book Among Whales (1995) 
and won acclaim for his contributions to sci- 
ence and public engagement. Prince Bern- 
hard of the Netherlands named him a Knight 
of the Golden Ark in 1978. In 1984, he won 
a MacArthur Fellowship. The WWF named 
him a Member of Honor in 1980, and the 
United Nations Environment Programme included
him in its Global 500 Roll of Honor
in 1988. In 2007, he received Oxford University’s
Dawkins Prize.

A tireless advocate, Payne served on nu- 
merous international scientific and conserva- 
tion advisory boards, most notably the board 
of directors of the Sea Shepherd Conservation 
Society, for which he helped to create a sci- 
ence-driven and prevention-based strategy. 
He also served as principal adviser to Proj- 
ect CETI (Cetacean Translation Initiative), an 
interdisciplinary nonprofit seeking to decode 
the communication of sperm whales. 

We were close friends of Roger’s, and one 
of us (D.R.) is a fellow marine mammal sci- 
entist. During a recent spirited conversa- 
tion, Roger wanted to be sure we understood 
his ideas for Project CETI. In characteristic 
fashion, Roger saw this project both as be- 
ing scientifically valuable and as having the 
potential to generate respect and awe for
vulnerable mammals. We also discussed 
the encouraging signs of progress in pro- 
tecting the endangered vaquita, a spe- 
cies that Roger long believed could be 
rescued with sufficient effort. 

A combination of hubris and humil- 
ity contributed to Roger’s success. He 
believed that he could understand and 
affect the world but tempered that con- 
fidence by acknowledging how little 
we know and how much we need each 
other. He created consensus by instilling 
in others the awe and wonder he felt to- 
ward living creatures. Recognizing that 
his goals transcended science, he col- 
laborated with artists as well. Roger him- 
self was an accomplished cellist, and he 
formed a lasting friendship with author 
Cormac McCarthy on the basis of their 
mutual belief in the power of narrative 
to win hearts and minds. 

Roger’s unstinting efforts on behalf of 
whales and other creatures may be the most 
successful example of translational science— 
his work connected with people emotionally,
causing them to change their way of thinking
and spurring governments to action. His
enduring translational efforts through public 
science communication and the arts have 
opened hearts worldwide and, as he wished, 
made people around the world fall in love 
with whales. His research and conservation 
efforts will continue to inspire scientists. 

John Donne’s poem “For whom the bell 
tolls” contains Roger’s favorite quote. The line 
begins with the famous phrase “No man is an 
island,” which in Roger’s mind extended to 
all living creatures, and ends with the phrase, 
“send not to know / For whom the [funerary] 
bell tolls, / It tolls for thee.” The loss of one— 
be it whale or human—is a loss for us all. This 
idea was Roger’s guiding light. 
Tracing the ocean’s topography


Scientific curiosity and commercial interests 


drive seafloor mapping efforts 


By Carmen Gaina 


From the coastlines and shallow waters

to the darkest ocean abyss, there are 

still many mysteries to uncover about 

the 70% of land that lies underwater 

on our planet. In particular, the deep 

ocean floor, most of it >2500 m be- 
low sea level, fascinates scientists, explor- 
ers, adventurers, and the business world. 
In 2017, representatives from these com- 
munities launched Seabed 2030, an ambi- 
tious international initiative to compile a 
detailed global seafloor map by 2030. This 
effort attracted the interest of ocean jour- 
nalist Laura Trethewey, and her new book, 
The Deepest Map: The High-Stakes Race to 
Chart the World’s Oceans, beautifully re- 
veals the various threads involved in build- 
ing this intricate tapestry. 

Sonar (sonic navigation and ranging)— 
an instrument invented at the beginning 
of the 20th century and since perfected— 
is central to this story. It is regularly
employed aboard small fishing boats and
large research vessels alike, where it uses 
sound to image the hidden relief of the 
seafloor. But The Deepest Map is more 
focused on the unsung heroes of ocean 
mapping than on technological 
innovation. These individuals 
spend months, years, and even 
decades at sea and in their offices 
facilitating the acquisition of the 
myriad datapoints that are trans- 
formed into the seabed map. 

Trethewey’s tale begins with 
Cassie Bongiovanni, a young fe- 
male marine geologist who iden- 
tified the five deepest points of 
Earth’s major oceans as targets 
for explorer Victor Vescovo’s Five 
Deeps Expedition. Weaving together sto- 
ries about the expedition with the history 
of mapmaking, she reminds readers about 
the beauty and the perils of exploring the 
oceans—a vast territory that can bring to- 
gether or divide people and nations. 

I particularly liked how Trethewey centers
female mapmakers, beginning with
the amazing Marie Tharp, a geologist at
the Lamont Geological Observatory (now
the Lamont-Doherty Earth Observatory)
who quietly dedicated her career to uncovering
the longest mountain chain on
Earth—the mid-ocean ridge—and bringing
it to the public’s attention. Readers are also
reminded of the powerful marriage of science
and art, as evidenced by how Tharp’s
physiographic maps were transformed by
the painter Heinrich Berann into spaces
“where someone might go for a stroll.”

The Deepest Map
Laura Trethewey
304 pp.

corals from the Balanus Seamount, located off the
coast of Massachusetts in the Atlantic Ocean.

Manned missions capture the public’s 
attention, but deep-ocean drones that can 
collect large datasets without a human 
crew are increasingly doing the heavy lift- 
ing on the seafloor. While these more en- 
vironmentally friendly uncrewed vessels 
can finish the mapping more efficiently, 
an important part of the process—ocean 
storytelling—is lost when humans cease to 
journey into the deep themselves. 

To know or not to know the secrets of 
the ocean floor, this is one of the questions 
raised in The Deepest Map. While detailed 
bathymetric maps can now be made and 
used, for example, by small fishing commu- 
nities or by archaeologists in search of clues 
about human history before sea level rises, 
ocean maps may also be used for nefarious 
purposes and could help facilitate the de- 
struction of pristine habitats. Science and 
industry may push for more knowledge, but 
how much of the natural world are we pre- 
pared to unsettle in its pursuit? 

As I write this review, there are a few 
more days left until the deadline given to 
the International Seabed Authority to come 
up with regulations for deep-sea mining in 
international waters. Although 
many global companies have 
vowed to avoid sourcing from 
the deep sea, others cannot wait 
to mine vast underwater regions 
that harbor untouched habitats. 
Trethewey reminds readers that 
nature is usually at its best if it is 
left alone, and she encourages us 
to be part of it by gently acknowl- 
edging its right to privacy. 

In the same decade that hu- 
mankind set foot on the Moon, 
we also reached the deepest point of our 
planet’s ocean, yet space exploration has 
since earned a much bigger share of atten- 
tion and funding. With The Deepest Map, 
Laura Trethewey seeks to shift this balance, 
offering readers superb insight into the 
world of ocean mappers, explorers, adven- 
turers, and their supporters. 
Venturing into the unfathomable


A writer examines our evolving relationship 


with the deep ocean 


By Erik Cordes 


The Underworld is a personal tale of

the ways in which humans interact 

with the deep ocean. Author Susan 

Casey delves into the past to look at 

how the oceans were first explored 

and into the future to see how we 
might use (and abuse) them in the com- 
ing decades. The book also follows her own 
journey of marine exploration, from her 
early connections to the ocean to a memo- 
rable dive into the deep. 

The Underworld is written in plain 
language, and Casey employs entertain- 
ing tongue-in-cheek humor, proclaiming 
early on that we are now “quite sure the 
abyss is dragon-free.” Despite the absence 
of dragons, we humans have long feared 
the deep ocean, an anxiety on full display 
during the recent search for the submersible
that tragically imploded en route to the
final resting place of the shipwrecked Ti- 
tanic. Casey wonders whether perhaps this 
is why the annual budget of NASA dwarfs 
that of the National Science Foundation 
and the National Oceanic and Atmospheric 
Administration and why much less money 
has been spent on the exploration of the 
seafloor—99% of which has never been 
viewed by human eyes—than on space. 

Casey uses technical scientific language, 
giving scientific names of species and taxa 
in her descriptions, and offers vivid, but 
not hyperbolic, pictures of the fanciful 
creatures and alien habitats she encoun- 
ters. “In the deep,” she writes, for example, 
“there are creatures...with glass skeletons” 
and ones that “might have two mouths or 
three hearts or eight legs.” She does a re- 
markable job of rendering such species lik- 
able and even inspirational. 

Throughout the book, Casey conveys her 
love of the sea. This accomplishes something 
quite difficult and rare: It allows the reader 
to develop a personal connection with the 
deep ocean. She talks about how her own mix 
of “wonder and fear” of the ocean creates a 
sense of “the sublime”—emotions that accompany
her as she embarks on a dive to the seafloor
aboard an 11-ton deep-sea submersible.
“If you were to walk across the Diamantina
Trench your feet would sink into soft
sediment,” Casey observes, inviting readers to
consider a feature of the underwater world 
that they can relate to. She finds more famil- 
iar territory in the Denmark Strait Cataract— 
“the planet’s tallest, mightiest waterfall” that 
just so happens to be 2000 feet underwater. 

Readers are introduced to a number 
of individuals who have been to the deep 
ocean and studied its mysteries. These in- 
clude Leonardo da Vinci, who cautioned 
against the overexploitation of the ocean, 
and Don Walsh, one of the first two hu- 
mans to visit the Mariana Trench, who 
relays a story of “sphincter-clenching ter- 
ror” experienced when an external part of 
the vessel he was aboard cracked at 31,000 
feet below sea level. (The damage proved 
harmless, but “it got our attention,” notes 
Walsh.) As summarized by submersible pi- 
lot Buck Taylor: “The ocean has a way of 
reminding you who’s in charge.” 

These experiences in a submersible, 
however, stand in stark contrast to Casey’s 
own, which elicited feelings of “ecstasy” 
rather than terror. She recounts “a peace- 
fulness that comes from knowing your 
place in the true order of things” and 
describes returning to the surface when
“the hypnotic blue reestablished itself, increasing
in intensity until it was almost
unbearable.” Her writing in this section
articulates a feeling that I have always had
trouble describing and provides one of the
most accurate and vivid portrayals of a
deep-sea dive that I have ever read.

SCIENCE, SEX, AND GENDER

South African intersex athlete Caster Semenya’s recent legal victory

Throughout the book, Casey frequently 
returns to the idea that humans have the 
potential to irreversibly alter the function
of the deep ocean, which is the largest
habitat on Earth. A chapter on seafloor
mining contrasts our desire for minerals 
and potential monetary gains with conser- 
vation goals. Marine biologist Sylvia Earle 
sees this as a key issue at a critical moment 
in time. “Our highest priority must be to 
safeguard whatever remains of the natural
carbon-capturing systems,” she tells
Casey, “and by far the largest, relatively un- 
disturbed, intact part of the planet is the 
deep sea.” 

“The deep isn’t merely a part of our
planet—it is our planet,” Casey writes in the
opening pages of The Underworld. “You’d
think we would want to be more familiar 
with it.” Although I am highly biased, I must 
agree, and I think any reader of this book 
will be convinced of the same. 

10.1126/science.adi9396 
Legacies in South African Medicine
The inaccurate idea that intersex
births are more common in Black 
South Africans emerged from 
theories put forward by European 
colonizers determined to find differ- 
ences between races. It persists 
to this day and plays a role in per- 
petuating the discrimination faced 
by intersex individuals of all races. 
This week on the Science podcast, 
Amanda Lock Swarr discusses the 
sloppy science that inspired and 
maintains this myth, the damage 
it has wrought, and the activists 
calling for change. 

bit.ly/46BPC88 
LETTERS


Edited by Jennifer Sills 


Editor’s note 


On 30 September 2021, Science published 
the Research Article “Light-induced 

mobile factors from shoots regulate 
rhizobium-triggered soybean root nodulation”
by T. Wang et al. (1). On 30 June 2023,
an Editorial Expression of Concern alerted 
readers that some missing data had been 
brought to the editors’ attention (2). The 
authors have now corrected the paper. As 
described in an Erratum (3), GmNIN expres- 
sion data have been added to Fig. 5, and 

the supplementary materials have been 
updated. These changes have addressed 
concerns about the integrity of the paper. 
Therefore, Science has removed the Editorial 
Expression of Concern and posted this noti- 
fication in its place to indicate the editors’ 
confidence in the Research Article’s data 
and conclusions. We thank the community 
for bringing these issues to our attention. 


H. Holden Thorp 
Editor-in-Chief, Science 
Keep the Salween River
free-flowing 


Free-flowing river ecosystems are essen- 
tial to the protection of biodiversity and 
the provision of ecosystem goods and 
services (1, 2). Yet, at a global scale, the 
flow and connectivity of most large rivers 
have been disrupted by the construction 
of dams to meet the growing population’s 
soaring demands for water, energy, and 
food production (3, 4). Consequently, 
freshwater biodiversity is now imperiled 
in most large river basins around the 
world, freshwater fisheries are collaps- 
ing, and estuarine deltas are shrinking 
due to sediment capture upstream (2, 5, 6).
Much attention is now focused
on mitigating the impacts of immense
development in heavily modified river 
basins, but protecting the few remain- 
ing large free-flowing rivers is equally 
important, especially those where devel- 
opment is imminent (4, 7), such as the 
Salween River in southeast Asia. 

The Salween River is the longest
free-flowing river in southeast Asia 
and one of only two such rivers in the 
region (together with the neighboring 
Irrawaddy River). The Salween sup- 
ports some of the most biodiverse areas 
in the world, including high levels of 
endemism, and it provides some of the 
only remaining free-flowing habitats for 
previously widespread species, such as 
anguillids, that have disappeared from 
other major river basins because of 
dams (6, 8). 

The Salween Basin is also home 
to diverse Indigenous peoples who 
depend on the river for their livelihoods 
and maintain deep spiritual connec- 
tions and a reciprocal interdependency 
with the river system. For example, 
the Karen of Thailand establish fish
sanctuaries with formalized rules and 
penalties for violations. The community 
conducts annual prayer ceremonies 
to bless endemic fish species (such as 
Garra spp.) and actively protects their 
spawning habitats by managing water 
levels and warding off predators (9). 
Dam construction has been strongly 
and successfully opposed in the 
Salween Basin for many decades. 
However, the demand for regional eco- 
nomic recovery in the post-pandemic 
era could renew enthusiasm for 
Salween hydropower projects. To 
protect the river’s free-flowing status, 
the Salween Basin nations should 
establish a robust intergovernmental 
organization with academic backing to 
facilitate communication and coopera- 
tion among stakeholders and to assess 
the costs and benefits of any decisions 
to construct dams. Such efforts must 
incorporate the interests, values, and 
long-held knowledge of the river held 
by Indigenous communities, which 
stand to be most affected by any loss to 
the Salween’s free-flowing status. 
Safeguarding the river will also 
require adjustments to the energy 
structure and economic development 
model in the region. For example, gov- 
ernments should prioritize industries 
that use less energy, such as nature- 
based tourism, and promote other 
renewable energy sources, such as solar 
and wind power. Countries through 
which the Salween flows should resist 
the idea that building dams in the few 
remaining free-flowing rivers is inevi- 
table. Instead, they should celebrate 
and protect the unique environmental, 
social, economic, and spiritual values 
that only such river basins can provide. 
Juan Tao1, Nick Bond2, Nyo Nyo Tun3,
Chengzhi Ding1*
1Yunnan University, Kunming 650500, China.
2Centre for Freshwater Ecosystems, La
Trobe University, Wodonga, VIC, Australia.
3Mawlamyine University, Mawlamyine
CMQ3+9F, Myanmar.
The author and his
grandmother sit by a
tribute to education
in the town square of
Jiangsu, China.


PAST AS PROLOGUE 


One person’s trash: 
Another’s treasured education


I was born in a small Chinese village in the 1990s. My grandmother was a humble,
hardworking farmer, but she was never able to earn much money from the land. After my
parents’ divorce, I went to live with her and discovered that she had a secret part-time
job. After school each day, I would join her in the landfill as a trash picker. She collected
recyclable bottles while I hunted for unfinished ballpoint pen refills to reuse. The landfills
were my first encounter with the world outside of my small town—some of the trash had
come from the West.

Although she had never finished primary school, my grandmother always dreamed
that I would go to college. With the money she made selling bottles, she bought food and
supported my education. Throughout my upbringing, she raised me to believe that the
possibilities were limitless. The confidence she instilled allowed me to move to the United
States, earn a PhD in mechanical engineering, and pivot into chemical engineering during
my postdoc fellowship.

I now understand that the trash from rich countries redirected to the landfills in China
is part of a cycle of environmental injustice. My grandmother considered the plastic
bottles she found to be treasures because they could support us financially, but millions
of tons of plastic trash are still exported from developed countries every year to be
stored or burned in emerging economies, polluting the ground, water, and air.

As I came to see my childhood experiences in a different light, I developed an interest
in designing technologies that can address environmental injustice. I decided to work
toward transforming carbon emissions into sustainable commodities, with the goal of
benefiting marginalized communities. My grandmother inspired these efforts, and I hope
my work will improve the lives of those like her.

When I visited my grandmother in 2019, I discovered that the town government had
posted a tribute to education in the central square, which includes my story and encourages
others to pursue their dreams. The display reminds me how far I have come, but the
square also holds the memory of where I began: It was one of the places my grandmother
and I frequented to collect trash.

Xiangkun Elvis Cao 


Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 
02139, USA. Email: elviscao@mit.edu 
Call for Submissions: Past as Prologue is an occasional feature highlighting the role of family history in the life of
scientists. What role did your family background play in your decision to pursue science, your field, or your career? 
Submit your story to www.submit2science.org. 
US Supreme Court opinion
harms watersheds 


The United States has lost vast amounts of 
historic wetlands. Twenty-two states have 
lost more than 50% of their wetland area, 
and some have lost more than 80% (1). In
California, only 9% of the 5 million acres 
of wetlands that existed in 1850 remain 

(1, 2). The federal Clean Water Act (CWA) 
(3) has safeguarded US wetlands since 
1972, but a recent Supreme Court decision 
removes the majority of US wetlands from 
federal protection. 

The degree to which the CWA was 
meant to apply to wetlands and stream 
tributaries, in addition to navigable 
waters, has been a subject of legal debate, 
but the United States depends on these 
vulnerable ecosystems (4). Small, tempo- 
rary, and seasonal streams and wetlands 
maintain hydrological, chemical, and 
biological functions that are essential in 
sustaining human well-being, ecological 
health, and the economy (5, 6). Wetlands 
outside of floodplains, such as prairie 
potholes, provide US$673 billion per year 
in ecosystem services, and headwater 
streams contribute US$15.7 trillion per 
year to the US economy (4). 

On 25 May, in the case of Sackett v. US 
Environmental Protection Agency, the US 
Supreme Court declared that a wetland, 
to be afforded CWA protection, must have 
a continuous surface connection with an 
ocean, river, stream, or lake, a require- 
ment that demonstrates a fundamental 
lack of understanding of how natural 
waters function. Relying on physical con- 
nectivity of surface waterbodies alone 
also ignores watersheds’ chemical and 
biological connections. Under the Court’s 
reasoning, even the 2020 Navigable 
Waters Protection Rule, which also misin- 
terpreted science and ignored the CWA’s 
goals (7, 8), protected too many waters. 

Although the Court’s opinion focuses 
on wetlands, it also jeopardizes non- 
perennial streams, which include 59% of 
all streams in the conterminous United 
States and more than 81% of streams in 
the arid and semi-arid Southwest (9, 10). 
As a result, ecosystem services of water- 
sheds across the United States are threat- 
ened, including water quality and quan- 
tity, flood protection and mitigation, and 
the maintenance of biodiversity, including 
endangered species as well as recreation- 
ally and commercially valuable fish like 
salmon and herring (6). 

The Court’s decision has substantially 
weakened water protection at a time 
when protections should be buttressed. 
In addition to massive losses and impairment
of aquatic resources nationwide,
warmer temperatures and altered pre- 
cipitation regimes associated with global 
climate change are expected to further 
accelerate wetland loss (6, 11). Congress 
could and should remedy this situation, 
but barring a congressional act, protec- 
tion for these aquatic resources shifts 

to the states, only 19 of which currently 
have comprehensive regulatory programs 
for nontidal wetlands and freshwater 
resources (12). The US Environmental 
Protection Agency and Army Corps of 
Engineers will attempt to readjust regula- 
tions to align with the Court’s opinion, 
but the impacts of this decision will ripple 
through the nation’s waters with long- 
lasting, detrimental effects. 

S. Mazeika Patricio Sullivan1* and
Royal C. Gardner2
1Baruch Institute of Coastal Ecology and Forest
Science, Clemson University, Georgetown, SC,
USA. 2Institute for Biodiversity Law and Policy,
NEXTGEN VOICES:
HISTORIC INTRODUCTIONS 


Add your voice to Science! Our new 
NextGen Voices survey is now open: 
Imagine that you can introduce any two 
scientists, regardless of when or where they 
lived. Which scientists would you introduce, 
and how could their collaboration change the 
course of history? 


To submit, go to 
www.science.org/nextgen-voices 


Deadline for submissions is 11 August. 
A selection of the best responses will be
published in an upcoming issue of Science. 
Submissions should be no more than 

150 words. Anonymous submissions will not 
be considered. 
TECHNICAL COMMENT ABSTRACTS


Comment on “Ultrastructure reveals ancestral 
vertebrate pharyngeal skeleton in yunnanozoans” 


Kaiyue He et al. 

Tian et al. (Reports, 8 July 2022, p. 218) 
hypothesized that yunnanozoans are stem- 
group vertebrates on the basis of “cellular 
cartilage,” “fibrillin microfibers,” and “sub- 
chordal rod” associated with the branchial 
arches of yunnanozoans. However, we 
reject the presence of cellular cartilage,
fibrillin, and the phylogenetic proposal of 
vertebrate affinities based on ultrastructure 
and morphology of yunnanozoans from 

more than 8000 specimens.
Full text: dx.doi.org/10.1126/science.ade9707 


Comment on “Ultrastructure reveals ancestral 
vertebrate pharyngeal skeleton in yunnanozoans” 


Xi-guang Zhang and Brian R. Pratt 

Tian et al. (Reports, 8 July 2022, p. 218) 
claim that Cambrian yunnanozoan animals 
are stem vertebrates, based partly on their 
observation at the nanometer scale of 
microfibrillar tissue located in the branchial 
arches. They interpret this to represent cellular
cartilage with an extracellular matrix

of microfibrils. Instead, we argue that 

the “microfibrils” are more likely modern 
organic contamination. 

Full text: dx.doi.org/10.1126/science.adf1472


Response to Comments on “Ultrastructure 
reveals ancestral vertebrate pharyngeal skeleton 
in yunnanozoans” 

Qingyi Tian et al. 

He et al. dispute our anatomical inter- 
pretations on the structures of cellular 
chambers and microfibrils in yunnanozoan 
branchial arches and put forward alterna- 
tive interpretations on these structures. 
Zhang and Pratt argue that the microfibrils 
we identified in yunnanozoans are more 
likely modern organic contamination. We 
provide additional evidence to support our 
interpretations and dismiss the alternative 
interpretations. 

Full text: dx.doi.org/10.1126/science.adf3363 
ARCHAEOLOGY


Who were the servants 
at Machu Picchu? 


The famous site of Machu Picchu
was the rural estate of Pachacuti, 
one of the most powerful emper- 
ors of the Inca empire. It was a 
seasonal retreat for the royal 
lineage, whose members were served 
by a permanent staff of retainers. 
Ethnohistorical sources suggest that 
these retainers, all non-Inca, were 
privileged compared with the larger 
population. Little is known, however, of 
their origins and homelands. To explore 
this, Salazar et al. analyzed the DNA 
of 34 retainers buried at the site and 
discovered that they were an extraor- 
dinarily diverse group who came from 
locations across the empire. The iden- 
tification of females from Amazonia 
suggests a greater Inca presence in 
that region than has been previously 
understood. —MSA 
Sci. Adv. (2023) 10.1126/sciadv.adg3377 


Ancient DNA from Machu Picchu 
reveals great diversity among, 
and distant origins for, the royal 
servants buried there. 


NEUROSCIENCE 
Learning with nitrosylated 
CaMKII


Cognitive function declines 
with age, as does the amount 
of S-nitrosylated protein in 
the brain. This includes the 
kinase CaMKII, which mediates 
synaptic signaling in neurons 
that support learning and 
memory. Rumian et al. found 
that S-nitrosylation of CaMKII
was critical for the localization
of the kinase to synapses and 
for the response of hippocam- 
pal neurons to a stimulus that 
mimics experience-induced 
brain activity. Cognitive func- 
tion in young mice expressing 
S-nitrosylation-deficient
CaMKII was impaired to a
similar extent as that in older,
wild-type mice. —LKF 
Sci. Signal. (2023) 
10.1126/scisignal.ade5892 


QUANTUM SIMULATION 
Teasing out the effects 
of interactions 


Ultracold atomic gases can 
simulate Hall effects in solids 
despite being electrically 
neutral. One of the tricks used 
in investigating this effect is to 
introduce a synthetic dimen- 
sion to the system. Zhou et al. 
used this approach to study 
how interactions affect the Hall 
response. The researchers held 
fermionic ytterbium atoms 

in a one-dimensional optical 


lattice and used two of the 
atomic hyperfine states as the 
orthogonal synthetic dimension. 
Tilting the lattice to generate a 
current enabled the measure- 
ment of the Hall response for 
a range of atomic interaction 
strengths. Consistent with 
recent theoretical predictions, 
the Hall response was found to 
be independent of interaction 
strength for sufficiently large 
interactions. —JS 

Science, add1969, this issue p. 427 


LINGUISTICS 
Commonalities in word 
meaning extension 


Humans creatively describe new 
things that lack words, but must 
do so by relying on known words
in their vocabulary. This process 
of reusing the same word for 
multiple meanings, which is 
called “word meaning extension,” 
occurs at the individual level 
(short term during childhood) 
as well as the population level 
(long-term language evolution). 
One long-standing issue has 
been whether the patterns that 
children use (individual level) to 
extend new meanings to known 
words are similar to patterns 
implicated in longer-term, 
historical language evolution 
(population level). Brochhagen 
et al. developed a computa- 
tional framework to examine 
this question in more than 1400 
languages at both the individual 
and population levels (see the 
Perspective by Greenhill). They
found that word meaning exten- 
sions across both levels share 
common ground because they 
are associated with cognitive 
advantages for learning, remem- 
bering, and understanding 
words. —EEU 

Science, ade7981, this issue p. 431; 

see also adj2154, p. 374 


Purkinje cells with more 
than one input 


Purkinje cells are the primary 
output neurons of the cer- 
ebellum. It is generally thought 
that one Purkinje cell receives 
monosynaptic input from one 
climbing fiber, which forms 
many excitatory synapses with 
it. Busch and Hansel found that 
in contrast to the assumed 
universal one-to-one relation- 
ship, most Purkinje cells in the 
adult human cerebellum receive 
multiple climbing fiber inputs. 
In mice, multibranched Purkinje 
cells show more than one climb- 
ing fiber input. This innervation 
pattern generates independent 
computational compartments 
within single Purkinje cells. These 
results indicate that there is 
more anatomical and functional 
Purkinje cell diversity than tradi- 
tionally assumed and that there 
are substantial differences in 
the prevalence of multibranched 
cells between mouse and human, 
where this dendrite type is pre- 
dominant. —PRS 

Science, adi1024, this issue p. 420


Imaging of mouse and human 
Purkinje cells reveals interspecies
differences and an unexpected 
diversity in architectures. 
Engineering blood stem
cells in vivo 


Bone marrow stem cells are 
the source of all hematopoietic 
(blood) cells in the body. For 
patients with blood disorders, 
bone marrow transplantation 
with healthy donor marrow can 
be a highly successful therapy 
and can be curative for certain 
conditions. Breda et al. designed 
a strategy to reprogram bone 
marrow stem cells directly within 
the body without the need for 
donor cells or the use of poten- 
tially toxic conditioning regimens 
such as chemotherapy or 
radiation (see the Perspective by
Ferrari and Naldini). Messenger 
RNA was delivered to bone mar- 
row stem cells by intravenous 
injection in lipid nanoparticles, 
facilitating both gene editing 
and bone marrow transplanta- 
tion. The ability to engineer bone 
marrow cells inside a patient 
without the need for traditional 
transplantation approaches 
could hold promise for a number 
of genetic disorders. —PNK 
Science, ade6967, this issue p. 936; 
see also adj0997, p. 378 


Lighting up amino acid 
synthesis 
Many enzymes have the ability 
to perform chemistry beyond 
the scope of their natural reac- 
tions when put together with 
reactive substrates. Adding light 
to the mix might be a strategy 
for broadening reactivity even 
further by generating radical 
species. Cheng et al. found that 
an engineered tryptophan syn- 
thase could function together 
with an organic photocatalyst to 
produce a range of noncanonical 
amino acid products, including 
those with a beta-methyl group, 
enantio- and diastereoselec- 
tively. The authors propose a 
mechanism in which a radical 
generated by the photocatalyst 
intercepts an aminoacrylate 
intermediate in a reaction cycle 
that partially parallels the natural 
reaction. —MAF 

Science, adg2420, this issue p. 444 
PLANT METABOLISM
Colorful carotenoids 


Biofortification of vitamin A in crops famously led to
the development of “golden rice.” However, in some 
cases, enhancing the concentrations of carotenoids 
(vitamin A precursors) is less successful because of 
carotenoid instability. Instead of optimizing native 
plant biosynthesis pathways, Zheng et al. expressed fungal 
carotenoid synthesis enzymes in Arabidopsis and citrus 
tissues. This allowed for the accumulation of provitamin 
A in cytosolic lipid droplets, in which carotenoids are less
sensitive to light-induced degradation. One application is to 
enhance the carotenoid content of lipid-bearing seeds for 


crop biofortification. —MRS 


Mol. Plant (2023) 10.1016/j.molp.2023.05.003 


It is challenging to fortify plants with unstable carotenoids such as 
vitamin A; golden rice, shown on the left, is an exception. 


PROTEIN INTERACTIONS
Expanding predictions 
at interfaces 


Coevolution is a natural process 
by which two interacting proteins 
can change over time, result- 
ing in different sequences at a 
conserved interface. Analysis of 
this process has been useful in 
protein structure prediction, and 
a deeper understanding of the 
mechanisms may help to aug- 
ment the prediction of interfaces. 
Using a model protein complex, 
Yang et al. generated libraries of 
synthetic proteins with amino 
acids varying at six positions 
within a hydrophobic interface. A 
yeast display coevolution scheme 
allowed for the isolation of pairs of 
sequences that retained binding, 
resulting in a diverse collection of 
interactions based on the same 
starting scaffold. The authors 
then mapped the sequences in 
a coevolutionary network and 
determined structures of 10 pairs 
that provide details of specific- 
ity. They then used a pretrained 
protein language model to expand 
the scope of amino acid pairs, 
demonstrating the ability of this 
hybrid experimental-computa- 
tional approach to give useful 
predictions for protein-protein 
interactions in this system. —MAF 
Science, adh1720, this issue p. 412 


NEURODEGENERATION 


TRIM11 and tauopathies 


Alzheimer’s disease and more 
than 20 other dementias and 
movement disorders collec- 
tively known as tauopathies 

are defined by intracellular 
filamentous inclusions contain- 
ing the microtubule-associated 
protein tau. However, how tau is 
converted from soluble mono- 
mers to insoluble aggregates in 
these diseases remains unknown. 
Zhang et al. analyzed more than 
70 human tripartite motif (TRIM) 
proteins and identified several 
that could reduce tau aggregates 
(see the Perspective by Noble 
and Hanger). Among these, 
TRIM11 was able to maintain tau 
in its functional soluble form in a
manner mechanistically distinct 
from canonical protein quality 
control factors. TRIM11 was found 
to be markedly down-regulated 
in sporadic Alzheimer’s disease 
brains, potentially contributing to 
pathogenesis. Indeed, intracranial 
adeno-associated viral delivery of 
TRIM11 provided strong pro- 
tection against tau pathology, 
cognitive decline, and motor 
defects in multiple mouse models 
of tauopathies. -SMH 

Science, add6696, this issue p. 413;
see also adj0256, p. 377


LANGUAGE EVOLUTION 
Emergence of the 
Indo-European language 


Languages of the Indo-European 
family are spoken by almost 
half of the world’s population, 
but their origins and patterns of 
spread are disputed. Heggarty 
et al. present a database of 109 
modern and 52 time-calibrated 
historical Indo-European lan- 
guages, which they analyzed with 
models of Bayesian phylogenetic 
inference. Their results suggest an 
emergence of Indo-European lan- 
guages around 8000 years before 
present. This is a deeper root date 
than previously thought, and it fits 
with an initial origin south of the 
Caucasus followed by a branch 
northward into the Steppe region. 
These findings lead to a “hybrid 
hypothesis” that reconciles cur- 
rent linguistic and ancient DNA 
evidence from both the eastern 
Fertile Crescent (as a primary 
source) and the steppe (as a 
secondary homeland). —SNV 
Science, abg0818, this issue p. 414 


FRACTURE MECHANICS 
Faster than a 
speeding crack 


Under tension, stresses are ampli- 
fied in a material in the volume 
close to a crack tip, and the crack 
will propagate toward fracture 
when the potential energy 
exceeds the material’s fracture 
energy. It is generally thought
that the maximum velocity of a 
moving crack cannot exceed the 
Rayleigh wave speed. Wang et al. 
used brittle hydrogels as a model 
material in which a weak layer in 
the structure in front of the crack 
tip forms a sort of neck in an oth- 
erwise perfect hydrogel plate (see 
the Perspective by Marder). The 
authors observed cracks traveling 
faster than the shear wave speed 
and found that the supershear 
dynamics behaved differently 
from the theories of classical 
cracking. —MSL 

Science, adg7693, this issue p. 415; 

see also adj0963, p. 375 


CELL DEATH 
Macrophages’ pattern 
recognition receptor 


Infection by Gram-negative bac- 
teria results in the accumulation 
of cytosolic lipopolysaccharide 
(LPS) and activation of the non- 
canonical inflammasome, leading 
to pyroptotic cell death. Although 
this process has been proposed 
to occur without a dedicated 
LPS-sensing pattern recognition 
receptor, Rojas-Lopez et al. found 
that the primate-specific protein 
NLRP11 is required for effective
cell death after infection of
human macrophages by the bacterial
pathogen Shigella flexneri.
NLRP11 could bind both LPS and
caspase-4, and genetic deletion
of NLRP11 impaired noncanonical
inflammasome activation.
Together, these findings demonstrate
that NLRP11 can function
as a pattern recognition receptor
and is required for efficient
caspase-4-mediated cell death in
human macrophages. —CO 
Sci. Immunol. (2023) 
10.1126/sciimmunol.abo4767 


TRANSPLANTATION 
Deep dive into graft- 
versus-host disease 


Numerous studies have relied 
on analyzing peripheral blood to 
understand how T cells mediate 


graft-versus-host disease (GVHD), 
but little is known about these 
responses at the tissue level. 
DeWolf et al. compared site- 
specific T cell receptor (TCR) 
repertoires across a range of tis- 
sues from prospectively collected 
autopsies from patients with or 
without GVHD and from GVHD 
murine models. They consistently 
found similar TCR repertoires in 
tissues from similar anatomic 
sites regardless of patient disease 
status and confirmed that TCRs 
in peripheral blood offered a 
narrow view of the TCR repertoire 
observed in tissues. They also 
detected tissue-resident T cells 
at disease sites in some patients, 
with evidence for donor origin. This 
study provides insight into the T 
cell composition in tissues associ- 
ated with GVHD and highlights 
the powerful insights gained from 
directly analyzing tissues. —CNF 
Sci. Transl. Med. (2023) 
10.1126/scitranslmed.abq0476


DNA NANOTECHNOLOGY 
Automating bends 
in DNA assemblies 


DNA nanostructures have 
attracted interest for the bottom- 
up self-assembly of functional 
materials. However, designing 
important features such as 
vertices and bends is challeng- 
ing and error-prone. Pfeifer et 
al. established an approach that 
leverages algorithms based 
on molecular simulations to 
automate the design of vertices, 
sharp bends, and subtle curves in 
DNA assemblies. Implementing 
these algorithms in a user-friendly 
graphical interface allows for the 
design of free-form geometries in 
minutes. The authors validated the 
approach through the construc- 
tion of nanoscale mathematical 
curves, nozzles, G-clefts, and 
springs. These results may enable 
a design paradigm in which DNA 
assemblies mimic the complex 
architectures and topologies of 
engineered materials. —LNL 
Sci. Adv. (2023) 
10.1126/sciadv.adi0697 
Perspective by Greenhill). They
found that word meaning exten- 
sions across both levels share 
common ground because they 
are associated with cognitive 
advantages for learning, remem- 
bering, and understanding 
words. —EEU 

Science, ade7981, this issue p. 431; 

see also adj2154, p. 374 


Purkinje cells with more 
than one input 


Purkinje cells are the primary 
output neurons of the cer- 
ebellum. It is generally thought 
that one Purkinje cell receives 
monosynaptic input from one 
climbing fiber, which forms 
many excitatory synapses with 
it. Busch and Hansel found that 
in contrast to the assumed 
universal one-to-one relation- 
ship, most Purkinje cells in the 
adult human cerebellum receive 
multiple climbing fiber inputs. 
In mice, multibranched Purkinje 
cells show more than one climb- 
ing fiber input. This innervation 
pattern generates independent 
computational compartments 
within single Purkinje cells. These 
results indicate that there is 
more anatomical and functional 
Purkinje cell diversity than tradi- 
tionally assumed and that there 
are substantial differences in 
the prevalence of multibranched 
cells between mouse and human, 
where this dendrite type is pre- 
dominant. —PRS 

Science, adi1024, this issue p. 420


Imaging of mouse and human
Purkinje cells reveals interspecies
differences and an unexpected 
diversity in architectures. 




Macrophages promote 
motility 

Movement of food along the
gastrointestinal tract is critical for
nutrient absorption. Peristaltic
motility occurs through
interactions between the enteric
nervous system and immune 
cells known as macrophages, 
but how this neuroimmune 
cross-talk is regulated remains 
unclear. Pendse et al. found that 
intestinal macrophages produce 
a molecule called complement
component 1q (C1q), which is
best known as part of the classical
complement pathway that
destroys invading microbes.
However, the authors found that
C1q-producing macrophages
were located in close proximity to
enteric neurons and influenced a
neuromodulatory gene expression
program that promoted the
neuronal function and motility of
the small and large intestine. It
seems that in the intestine, the
key role for C1q is to regulate
intestinal motility rather than 
immune defense. —PNK 

eLife (2023) 10.7554/eLife.78558 


Changing dislocation 
motion 
The motion of dislocations 
is important for understand- 
ing how alloys deform and 
eventually fail. This process 
is especially important when 
hydrogen is involved because 
this small molecule makes steel 
more brittle. Huang et al. per- 
formed a series of tests showing 
that hydrogen may enhance dislocation
motion in iron. These
observations help to clarify the 
interaction between hydrogen 
and dislocations in iron, which 
is vital for understanding the 
embrittlement process in cer- 
tain metals. —BG 
Nat. Mater. (2023) 
10.1038/s41563-023-01537-w 


Delivering light from chip 
to free space 


Photonic integrated circuits 
provide a platform for min- 
iaturizing light sources and 
enabling optical devices with 
complex functionality to occupy 
a footprint much smaller than 
the corresponding bulk-optical 
components. The control and 
manipulation of light at visible and 
infrared wavelengths, important 
for communication and coupling 
with atoms and molecules for 
metrology applications, can be 
challenging from a fabrication 
perspective. Spektor et al. used
an inverse design approach to
generate complex structures in a
thin layer of tantalum oxide (tan- 
tala) for specific optical function. 
Such “function-first” optimiza- 
tion algorithms produce optical 
functionality from nonintuitive 
structures with excellent perfor- 
mance metrics. By allowing the 
generation of polarized light and 
vortex beams from visible through 
to near-infrared wavelengths, 
the approach provides a route to 
free quantum technologies from 
optical bench, laboratory-based 
environments. —ISO 
Optica (2023) 
10.1364/OPTICA.486747


How hot is hot? 


What is the best way to measure 
the intensity of an extreme heat 
wave? Russo and Domeisen com- 
pared four different heat-wave 
intensity indices with reanalysis 
data of temperature to show that 
methods using cumulative heat 
values provide a much different
picture than ones that use
averaged values. They found that
indices based on cumulative values
are preferable because they
allow events of different length
to be compared more easily. 
The cumulative approach shows 
that heat waves after 1986 have 
become nearly 10 times more fre- 
quent and up to three times more 
intense than during the period 
1950-2021. —HJS 

Geophys. Res. Lett. (2023) 

10.1029/2023GL103540 


Costing species invasions 
Invasive non-native species are 
costly because of losses to agri- 
culture, damage to infrastructure, 
and the devastation of ecosystem 
services. Eschen et al. recognized 
that invasive species are increas- 
ing in abundance across the 
British Isles. Currently, more than 
2000 invasive species are repro- 
ducing in the wild, with about a 
dozen becoming established each 
year. Accounting for inflation, 
invasive species are estimated
to cost the UK economy about
£4 billion annually. Roughly
half of these costs result from
the fungus Hymenoscyphus
fraxineus, which causes the tree
disease ash dieback, and from
the plant pathogenic oomycete
Phytophthora ramorum. Other
major contributors include 
Japanese knotweed, rabbit, 
gray squirrel, and brown rat. It is 
clearly critical that measures are 
adopted to deter yet more spe- 
cies becoming established in the 
British Isles. —CA 
Biol. Invasions (2023) 
10.1007/s10530-023-03107-2 


Hair may help humans 
beat the heat 


Humans differ from our great ape 
relatives in many ways, including 
in the distribution of body hair. 
Lasisi et al. tested the effect of 
scalp hair on thermal regulation 
under a variety of controlled wind, 
solar radiation, and dry or wet 
conditions. Hair of all textures 
reduced heat gain from solar 
radiation, with tightly curled hair 
having the most protective effect. 
Baldness increased heat loss 
through evaporation compared 
with having hair, but the increased 
water loss elevated the risk of 
dehydration. —CNS 

Proc. Natl. Acad. Sci. U.S.A. (2023)
10.1073/pnas.2301760120


Keeping hydrated 
during a 2019 heat 
wave in France 
LANGUAGE EVOLUTION


Language trees with sampled ancestors support a 
hybrid model for the origin of Indo-European languages 


Paul Heggarty et al. 


INTRODUCTION: Almost half the world’s popu- 
lation speaks a language of the Indo-European 
language family. It remains unclear, however, 
where this family’s common ancestral language 
(Proto-Indo-European) was initially spoken 
and when and why it spread through Eurasia. 
The “Steppe” hypothesis posits an expansion out 
of the Pontic-Caspian Steppe, no earlier than 
6500 years before present (yr B.P.), and mostly 
with horse-based pastoralism from ~5000 yr 
B.P. An alternative “Anatolian” or “farming” hy- 
pothesis posits that Indo-European dispersed 
with agriculture out of parts of the Fertile Cres- 
cent, beginning as early as ~9500 to 8500 yr 
B.P. Ancient DNA (aDNA) is now bringing valuable
new perspectives, but these remain only
indirect interpretations of language prehistory. 
In this study, we tested between the time-depth 
predictions of the Anatolian and Steppe hy- 
potheses, directly from language data. We re- 
port a new framework for the chronology and 
divergence sequence of Indo-European, using 
Bayesian phylogenetic methods applied to 
an extensive new dataset of core vocabulary 
across 161 Indo-European languages. 


A DensiTree showing the probability distribution of tree topologies for the Indo-European
language family. The time axis shows the estimated chronology of the family’s geographical expansion
and divergence, calibrated on 52 nonmodern written languages. Annotations add chronological context
relative to selected archaeological cultures and expansions of significant ancestry components in the aDNA
record. CHG, Caucasus hunter-gatherers; EHG, Eastern (European) hunter-gatherers; BMAC, Bactria-Margiana
Archaeological Complex.


RATIONALE: Previous phylolinguistic analyses
have produced conflicting results. We diagnosed
and resolved the causes of this discrepancy,
two in particular. First, the datasets used
had limited language sampling and widespread
coding inconsistency. Second, some analyses
enforced the assumption that modern spo- 
ken languages derive directly from ancient 
written languages rather than from parallel 
spoken varieties. Together, these methodo- 
logical problems distorted branch-length esti- 
mates and date inferences. We present a new 
dataset of cognacy (shared word origins) across 
Indo-European. This dataset eliminates past 
inconsistencies and provides a fuller and more 
balanced language sample, including 52 non- 
modern languages for a denser set of time- 
calibration points. We applied ancestry-enabled 
Bayesian phylogenetic analysis to test rather 
than enforce direct ancestry assumptions. 


RESULTS: Few ancient written languages are 
returned as direct ancestors of modern clades. 
We find a median root age for Indo-European 
of ~8120 yr B.P. (95% highest posterior den- 
sity: 6740 to 9610 yr B.P.). Our chronology is 
robust across a range of alternative phyloge- 
netic models and sensitivity analyses that vary 
data subsets and other parameters. Indo- 
European had already diverged rapidly into 
multiple major branches by ~7000 yr B.P., with- 
out a coherent non-Anatolian core. Indo-Iranic 
has no close relationship with Balto-Slavic, 
weakening the case for it having spread via 
the steppe. 
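
A median with a 95% highest posterior density (HPD) interval, as reported above, can be read off a set of posterior samples. The Python sketch below is only an illustration under stated assumptions, not the authors' pipeline: the function names (`hpd_interval`, `round_to_decade`, `summarize`) are hypothetical, and the samples are simulated stand-ins rather than the study's posterior. It takes the HPD to be the narrowest window that covers the requested share of the sorted samples and rounds to the nearest decade.

```python
import math
import random
import statistics


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval that contains `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sample")
    # Number of consecutive sorted samples the interval must cover.
    k = max(1, math.ceil(mass * n))
    # Slide a window of k samples over the sorted list; keep the narrowest one.
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


def round_to_decade(years):
    """Round an age in years to the nearest multiple of 10."""
    return int(round(years / 10.0) * 10)


def summarize(samples, mass=0.95):
    """Return (median, HPD lower bound, HPD upper bound), rounded to decades."""
    low, high = hpd_interval(samples, mass)
    median = statistics.median(samples)
    return round_to_decade(median), round_to_decade(low), round_to_decade(high)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical posterior root-age samples in yr B.P. (simulated, not real data).
    simulated_ages = [rng.lognormvariate(math.log(8000), 0.09) for _ in range(5000)]
    median, low, high = summarize(simulated_ages)
    print(f"median ~{median} yr B.P., 95% HPD {low} to {high} yr B.P.")
```

For a skewed posterior, the narrowest-window interval can differ from a simple 2.5% to 97.5% quantile range.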


CONCLUSION: Our results are not entirely con- 
sistent with either the Steppe hypothesis or 
the farming hypothesis. Recent aDNA evidence 
suggests that the Anatolian branch cannot be 
sourced to the steppe but rather to south of the 
Caucasus. For other branches, potential can- 
didate expansion(s) out of the Yamnaya cul- 
ture are detectable in aDNA, but some had 
only limited genetic impact. Our results re- 
veal that these expansions from ~5000 yr B.P. 
onward also came too late for the language 
chronology of Indo-European divergence. They 
are consistent, however, with an ultimate 
homeland south of the Caucasus and a sub- 
sequent branch northward onto the steppe, as 
a secondary homeland for some branches of
Indo-European entering Europe with the later 
Corded Ware-associated expansions. Language 
phylogenetics and aDNA thus combine to 
suggest that the resolution to the 200-year- 
old Indo-European enigma lies in a hybrid of 
the farming and Steppe hypotheses. 


All authors and affiliations appear in the full article online. 
Language trees with sampled ancestors support a
hybrid model for the origin of Indo-European languages 


Paul Heggarty, Cormac Anderson, Matthew Scarborough, Benedict King, Remco Bouckaert,
Lechosław Jocz, Martin Joachim Kümmel, Thomas Jügel, Britta Irslinger, Roland Pooth,
Henrik Liljegren, Richard F. Strand, Geoffrey Haig, Martin Macák, Ronald I. Kim,
Erik Anonby, Tijmen Pronk, Oleg Belyaev, Tonya Kim Dewey-Findell, Matthew Boutilier,
Cassandra Freiberg, Robert Tegethoff, Matilde Serangeli, Nikos Liosis, Krzysztof Stroński,
Kim Schulte, Ganesh Kumar Gupta, Wolfgang Haak, Johannes Krause, Quentin D. Atkinson,
Simon J. Greenhill, Denise Kühnert, Russell D. Gray


The origins of the Indo-European language family are hotly disputed. Bayesian phylogenetic analyses of
core vocabulary have produced conflicting results, with some supporting a farming expansion out of Anatolia
~9000 years before present (yr B.P.), while others support a spread with horse-based pastoralism out of
the Pontic-Caspian Steppe ~6000 yr B.P. Here we present an extensive database of Indo-European core
vocabulary that eliminates past inconsistencies in cognate coding. Ancestry-enabled phylogenetic analysis
of this dataset indicates that few ancient languages are direct ancestors of modern clades and produces a root
age of ~8120 yr B.P. for the family. Although this date is not consistent with the Steppe hypothesis, it
does not rule out an initial homeland south of the Caucasus, with a subsequent branch northward onto the
steppe and then across Europe. We reconcile this hybrid hypothesis with recently published ancient
DNA evidence from the steppe and the northern Fertile Crescent.


The Indo-European language family encompasses
more than 400 languages
(1, 2). These languages are spoken by
almost half of the world’s population
(2), and all derive from the same source
language: Proto-Indo-European (PIE). For 
more than 200 years, the origins of Indo- 
European have been disputed (3). The deep 
link between the widely dispersed Indo- 
European languages was discovered more than 
two centuries ago (4), but where their com- 
mon ancestral language was initially spoken, 
and when and why it spread so far through 
Eurasia, have remained enigmas ever since. 
Recent debate has focused on two leading 
hypotheses. The Steppe hypothesis posits that 
Indo-European spread out of the Pontic-Caspian
Steppe, no earlier than 6500 years
before present (yr B.P.), and mostly with horse-based
pastoralism from ~5000 yr B.P. (5) (Fig. 1B).
The farming hypothesis claims that Indo-European
dispersed with agriculture out of
parts of the Fertile Crescent, beginning as 
early as ~9500 to 8500 yr B.P. (6) (Fig. 1C). 
Linguistic reconstructions of some PIE lex- 
icon, and ancient contacts with early stages of 
the Uralic language family, have been widely 
interpreted as supporting the Steppe hypoth- 
esis (5, 7), but the interpretation of these data 
is controversial (8, 9) (Box 1). In contrast, anal- 
yses of Indo-European basic vocabulary using 
Bayesian phylogenetic methods initially sup- 
ported the time depth and geographical origin 
posited by the farming hypothesis (10, 11). Recent
papers (12-14) have challenged those early
time-depth estimates, in part because the model
used did not allow ancient languages to be
directly ancestral to any modern languages.
When eight ancient languages were constrained
to be directly ancestral, the date estimation
for the Indo-European root moved into the time
frame of the Steppe hypothesis (12). However,
a considerable problem with this analysis is 
that forcing direct ancestry produces date in- 
ferences toward the tips of the tree that con- 
flict with the known histories of several 
branches of Indo-European. The diversifi- 
cation of Romance languages, for example, is 
inferred to have started only 1000 years ago 
(12), when, in fact, regional differences had 
begun to arise a millennium earlier, as Roman 
expansion itself had already led to “great diversity
in the Latin that was spoken around
the Empire” (15). In this study, we investigated,
diagnosed, and resolved the problems
in data quality that led to these artifacts in 
dating inferences. 

Human ancient DNA (aDNA) is now also re- 
shaping the debate. Results support a sub- 
stantial influx of genetic ancestry from the 
Eurasian Steppe ~5000 yr B.P., which could 
have carried several of the main branches of 
Indo-European into Europe (16-18). However,
this ancestry signal is less evident in aDNA
from Mycenaean Greece (19), the Balkans (20),
and Anatolia (21-23), casting doubt on whether
the Steppe hypothesis can explain the spread
of all branches of the family, especially in the 
eastern Mediterranean and Asia. This fuller 
aDNA picture “does not support a classical way 
of looking at the steppe hypothesis” (24). 

Fig. 1. Indo-European languages through space and time. (A) Indo-European languages covered in the
IE-CoR database: 109 modern languages (round dots) and 52 nonmodern languages (diamonds). An
interactive version is available at https://iecor.clld.org/languages. Colors distinguish the 12 main clades
of Indo-European (other potential clades went extinct without sufficient written record). (B to D) Maps
showing alternative hypotheses for the first stages of Indo-European expansion. The hypothesis of an origin
in the western steppe (B) contrasts with the hypothesis of an earlier spread with farming (C). The map
in (D) shows a hybrid of parts of both hypotheses. Date estimates for the start of divergence within each
main clade are given in years before present. Language labels on the hypothesis maps reflect recent
end points, not necessarily earlier movements.

We overcame the limitations of previous
linguistic analyses by combining recent advances
in Bayesian phylogenetic inference with
a far more extensive Indo-European dataset.
First, we deployed a sampled ancestor phylogenetic
analysis (25) that permits but does
not force ancient languages to be directly ancestral
to modern languages (fig. S5.4). This is
achieved by using a birth-death-sampling tree
prior (fig. S5.4) in which a branching event
in the tree is a “birth” or diversification event,
and lineage extinction (“death”) events may
also occur. Each ancient language covered in
our dataset represents an occurrence of “sampling”
from the entire diversity of Indo-European
languages through time. Rather than assuming
that ancient languages were the direct ancestors
of their modern relatives, this approach
estimates from the linguistic dataset itself the
relative probability that any ancient language
is either a direct ancestor or a sister taxon to
its closest modern relatives. The model thus
determines from the data whether, for example,
the Proto-Romance source of all modern
Romance languages goes back directly to
the lexicon of written Classical Latin, as constrained
by one recent analysis (12), or to some
slightly different, spoken form of “Vulgar”
Latin. To estimate chronology, we used an
uncorrelated relaxed clock to allow different
language lineages in the phylogeny to vary in
rates of change over time (26). Cognacy status
also changes much faster in some types of
meaning than in others, so we tested different
approaches to this, using models of cognate
evolution that allow different rates of change
for every individual meaning, or for sets of
meanings that show similar degrees of divergence
in cognacy.
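
The idea behind an uncorrelated relaxed clock can be sketched in a few lines of Python. This is a simplified illustration under assumptions of our own, not the software or settings used in the study: each branch receives its own rate, drawn independently from a lognormal distribution with a chosen mean, and the expected amount of change on a branch is its rate multiplied by its duration. The function names, the lognormal choice, and every number below are hypothetical.

```python
import math
import random


def draw_branch_rates(n_branches, mean_rate, sigma, rng):
    """Draw one independent lognormal rate per branch with the given mean."""
    # A lognormal has mean exp(mu + sigma^2 / 2), so solve that for mu.
    mu = math.log(mean_rate) - 0.5 * sigma ** 2
    return [rng.lognormvariate(mu, sigma) for _ in range(n_branches)]


def expected_changes(durations_yr, rates_per_yr):
    """Expected number of changes on each branch: rate times duration."""
    return [d * r for d, r in zip(durations_yr, rates_per_yr)]


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical branch durations in years (illustrative values only).
    durations = [1500.0, 800.0, 2300.0, 600.0]
    # sigma = 0 reduces to a strict clock; larger sigma lets lineages differ more.
    for sigma in (0.0, 0.5):
        rates = draw_branch_rates(len(durations), 1e-4, sigma, rng)
        changes = expected_changes(durations, rates)
        print(sigma, [round(c, 3) for c in changes])
```

Because the rates are drawn independently for each branch, a fast-changing lineage does not imply that its descendants also change quickly, which is what "uncorrelated" means in this setting.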

Second, we identified artifacts in previous 
phylogenetic analyses that result from flaws 
and inconsistencies in the language datasets 
used (27). To resolve these, we implemented 
a methodology for encoding cognate data [see 
supplementary materials (SM) section 2] to 
maximize consistency across the language data- 
set and optimize it as input to phylogenetic 
analysis, creating an entirely new database of 
Indo-European cognate relationships, named 
IE-CoR. IE-CoR covers 161 languages, coded by 
more than 80 specialists on languages of the 
Indo-European family, to provide much den- 
ser and more-balanced sampling both within 
and between the main subclades of Indo- 
European. The 52 nonmodern languages in 
IE-CoR (Fig. 1A) provide a much denser set 
of date calibrations than earlier databases. 


Results 


Our main analysis (Fig. 2) produced an esti- 
mated date for the root of the Indo-European 
language family that is too early to be com- 
patible with the Steppe hypothesis: ~8120 yr 
B.P., with a 95% credible region of 6740 to 
9610 yr B.P. [Date estimates are reported here 
as a median date before present, followed by 
the 95% credible region (highest posterior 
density, or HPD), all rounded to the nearest 
decade, and taking the “present” for modern 
languages as 2000 CE.] The posterior tree dis- 
tribution also contained relatively few cases of 
direct ancestry between language taxa. Of the 
52 nonmodern written languages in the IE-CoR 
database, 27 might theoretically be considered 
potential candidates to be directly ancestral 
to more recent languages in their clades. Old 
English, for example, is potentially ancestral 
to modern English, and Ancient (Attic) Greek 
to modern forms of Greek. Figure 3 shows the 
prior and posterior probabilities for each of 
these nonmodern languages being a direct 
ancestor to any later language(s) in its clade 
Box 1. Recovering prehistory from languages.


Languages that derive from the same former ancestor language retain signals of that past origin and of
their divergence since then. By meticulously comparing the languages within a family, it is possible to
reconstruct aspects of their common ancestor language. Much of the PIE sound system (phonology) and
word structure (morphology) has been reconstructed, along with hundreds of individual word forms.

Linguistics has other methods to then make inferences about prehistory from such language data.
These qualitative methods are often claimed to support the Steppe hypothesis, but each major inference
remains disputed.


* Cladistic analysis of selected characters in phonology, morphology, and cognacy yielded no single
“perfect phylogeny” (50) but was taken to support a node uniting the Indo-Iranic and Balto-Slavic
branches (5), with putative parallels in aDNA (49). This node rested on only three data characters,
however. All three are contentious, notably the centum/satem distinction and the “ruki” rule (SM
section 7.6.2.1). There is no consensus support for this node in Indo-European linguistics, and our
analysis finds little support for it (a posterior probability of just 0.11). We also tested the effect of
enforcing this node and found little impact on the root date (Fig. 4, SA6b).


* Apparent ancient loanwords into early stages of the Uralic family (in northern Eurasia) have been
argued to originate in the Indo-Iranic branch of Indo-European and thus to point to the steppe as the
likely location of such contacts (5). However, other and even earlier claimed loanwords, with Caucasian
and Semitic languages, are more compatible with an ultimate homeland farther south (54).


* Linguistic paleontology assumes that certain word forms reconstructed to PIE denoted particular
artifacts, species, and concepts already known to its speakers, most notably the wheel. Reconstruction
operates through laws of sound change and can thus be precise and reliable on this level. There are no
comparably strict and predictable meaning laws, however, so it is often much more challenging to pinpoint
what exact meanings were at specific deep points in time. The same reconstructed word forms have
thus been inferred as evidence that PIE speakers either already did know of the wheel (5, 65), or that
they did not yet know of it, and that the invention postdated the common ancestor language (8, 66-68).
Indo-European origins have remained unresolved because all methods have left scope for interpretation 
and dispute and have failed to bring consensus on the tree topology, chronology, or homeland. For details, see 


SM section 2.2. 


(see also table S5.2). Our ancestry-enabled 
analysis finds posterior probabilities >0.01 for 
only four languages: Classical Armenian (0.50) 
and three ancient forms of Greek (0.72, 0.39, 
and 0.31). Only in two of these cases is the 
posterior probability greater than 50%. We 
found no support for the higher number of 
eight direct ancestors enforced in previous 
analyses (12). These results are driven by the 
cognate data, not our tree prior. In the prior, 
direct ancestry probabilities ranged from 
~42% to 78% for all 27 potential ancestor lan- 
guages, and the median root date estimate 
was 5815 yr B.P. (4149 to 8123 yr B.P.). Includ- 
ing the cognate data shifted the root date 
2305 years earlier, to our result of a median 
age of 8120 yr B.P. in the posterior. 
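
The date summaries used throughout (a median with a 95% highest posterior density interval, rounded to the nearest decade) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names and the toy sample list are assumptions introduced here, and in practice the samples would come from the posterior output of the phylogenetic analysis.

```python
import math
import statistics


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval that contains `mass` of the samples."""
    x = sorted(samples)
    n = len(x)
    k = math.ceil(mass * n)  # number of samples the interval must cover
    # Slide a window of k consecutive sorted samples and keep the narrowest one.
    start = min(range(n - k + 1), key=lambda i: x[i + k - 1] - x[i])
    return x[start], x[start + k - 1]


def round_to_decade(value):
    """Round a date in years to the nearest 10 years."""
    return int(round(float(value) / 10.0)) * 10


def summarize(samples):
    """Median and 95% HPD bounds, each rounded to the nearest decade."""
    low, high = hpd_interval(samples)
    return tuple(round_to_decade(v) for v in (statistics.median(samples), low, high))


# Hypothetical stand-in for posterior root-age samples (yr B.P.), not real output.
toy_samples = list(range(6000, 10001, 4))
print(summarize(toy_samples))
```

For a skewed posterior the HPD interval is generally not symmetric around the median, which is why both are reported.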

This lack of direct ancestry may, at first 
sight, seem unexpected. Old English is not 
inferred to be the direct ancestor to modern 
English, nor is Old Icelandic directly ances- 
tral to modern Icelandic. However, it is im- 
portant to clarify what a split between lineages 
represents in phylogenetic analyses of cog- 
nate datasets. A split does not just corre- 
spond to the major difference between discrete, 
mutually unintelligible “languages.” Rather, 
lineages must in principle already be split 
from each other for them to be free to start 
developing differently. Only once lineages are 
split can the first difference(s) emerge between 
them in the predominant lexeme they use,
even for just a single meaning in the dataset. 
So even dialects or registers (written versus 
spoken) of the “same” language can represent 
different, parallel sublineages. Thus, ancestry 
between past written languages and contem- 
porary spoken ones may not be fully direct 
(SM section 7). A whole language, taken in 
the broad sense as spanning multiple regis- 
ters and regional variants, therefore need not 
correspond just to a single lineage, but may 
span separate sublineages still very close to 
each other in the phylogeny. “Latin” as a whole 
encompassed both written Classical Latin and 
the spoken ancestor of Romance languages. 
In the history of English, the term “Old 
English” actually refers to a set of various di- 
alects. The IE-CoR Old English data are based 
on West Saxon, as the best documented of 
those dialects. As our results correctly reflect, 
this was not the dialect most directly an- 
cestral to modern English (28). Likewise, the 
Sanskrit of the sacred Vedic texts is not the 
direct ancestor of modern Indic languages 
but was a distinct sister dialect. Even the in- 
tervening Prakrits of Medieval India “do not 
derive from Sanskrit” (29) and, specifically, 
“do not go back directly to the dialect which 
formed the basis of Vedic” (29), which stood 
apart as a “far-western dialect” (30). The formal 
register of a written language typically differs 
from the contemporaneous spoken language
in the predominant usage of different words 
in a small proportion of the vocabulary, and 
this specifically includes meanings within the 
IE-CoR reference set of core lexicon. Even a 
near-direct ancestor may thus be expected to 
show some lexical differences with the lineage 
ancestral to modern spoken languages. For 
example, modern Romance languages do not 
derive directly from written Classical Latin (31).
Instead, “the origins of the Romance languages 
lie in the (irrecoverable) spoken language ... 
[and] there will always be a mismatch be- 
tween the Latin sources and the parent of the 
Romance languages” (32). Even one difference, 
in a single meaning of the 170 in the IE-CoR 
reference set, logically entails separate sub- 
lineages, and that ancestry is not fully direct. 
In the IE-CoR meaning MOUTH, for example,
the Classical Latin os was not inherited into
any modern Romance languages, and so is
not considered the primary term in Proto-Romance.
Most Romance languages use cognates
derived instead from bucca (hence, Italian
bocca, Spanish boca, and French bouche, for example),
which in colloquial Latin was already
used specifically in the meaning MOUTH as early
as Cato the Elder (234-149 BCE) (33). This one 
difference is already enough to entail that a 
phylogenetic analysis of primary lexemes (and 
thus cognacy states) between Classical Latin 
and Proto-Romance would correctly return 
these as separate sublineages, and it is not an 
isolated example. In practice, “many Classical 
Latin words do not survive into Romance” (15), 
or survive only sporadically, also in IE-CoR 
core vocabulary, such as EAT and GO (15). Our
ancestry-enabled model returns the standard 
linguistic analysis in this case: that written 
Classical Latin is not in fact directly ances- 
tral to modern spoken Romance languages. 
Specifically, in meanings where Classical Latin 
has a cognate set different to that in all 
Romance languages, the model correctly iden- 
tifies which branch is innovating in each case. 
Even Classical Latin singleton forms are cor- 
rectly identified as retentions, and the Romance 
forms as innovations on the (“spoken”) branch 
to Romance (see SM section 6.3). Likewise, 
written Old Icelandic is not quite directly 
ancestral to modern spoken Icelandic. This 
contradicts the assumptions enforced in earlier
ancestry-constrained analyses (12). Only in
four cases were specific historical written lan- 
guages [Classical Armenian and some forms of 
Ancient Greek (34, 35)] so close to the ancestor 
of later languages in their clades as to be near- 
ly indistinguishable in the IE-CoR sample of 
core vocabulary. 


Validation and robustness analyses


The validity of our results can be evaluated in 
three ways. First, estimates of lineage split 
dates can be validated against known historical 
Fig. 2. The posterior probability distribution of trees for the Indo-European family. Distribution visualized
using DensiTree (71). The time axis shows the estimated chronology of Indo-European expansion. Languages
whose tips do not reach the right edge are the 52 nonmodern written languages such as Hittite, Tocharian,
Mycenaean Greek, and Old English. These languages were used in the analysis as time calibrations. The two
gray curves show the distribution of root date estimates for the tree. The prior is light gray, and the posterior
estimate is dark gray. [Figure labels: median estimate for start of Indo-European divergence, 8120 BP; axis: Density.]


data. Ancestry constraints used in previous 
analyses produced lineage split dates far too 
recent to be compatible with known histories: 
no divergence among West Norse languages 
until 1650 CE, none in Romance until 1000 CE, 
and none in Indic until 100 CE (2). These arti- 
facts disappear from the ancestry-enabled 
analysis in Fig. 2. Icelandic and Faroese, for 
example, are now dated as splitting from the 
mainland Scandinavian lineages ~830 CE (470 
to 950 CE), closely in line with the first Norse 
settlement of the Faroes and Iceland in the 
ninth century. Initial divergence within Ro- 
mance is accurately dated to the Roman Em- 
pire in the first centuries CE. Divergence within 
Indic is dated to ~4370 yr B.P. (3640 to 5250 yr 
B.P.), in line with Vedic Sanskrit already being 
slightly divergent from the lineage(s) ancestral 
to modern spoken Indic languages (30). The 
inference of an Indo-Iranic split at ~5520 yr B.P. 
(4540 to 6800 yr B.P.) may, at first glance, seem 
surprising. Established expectations are for a 
more recent date, based on the perceived level of 
similarity between Vedic Sanskrit and Avestan— 
the earliest known ancient languages in the 
Indic and Iranic branches, respectively. How- 
ever, these judgments of linguistic similarity 
have been largely impressionistic (36) rather 
than quantified. In the precisely defined IE-CoR
meanings, Early Vedic and Younger Avestan 
share only 58.7% cognacy (37). This matches 
the level of cognacy that survives between the 
most divergent sublineages within the Ro- 
mance clade, for instance, after roughly two 
millennia since the spread of the Roman Em- 
pire. Early Vedic and Younger Avestan them- 
selves date back to at least the mid-fourth and 
mid-third millennia before present, respective- 
ly. A time depth two millennia earlier (~5520 yr 
B.P.) for the split between their lineages (Indic 
versus Iranic) is thus consistent with the 58.7% 
cognacy overlap between them. More widely, 
ancient Indo-European languages show close 
similarities in some aspects of their inflec- 
tional morphology (noun declension and verb 
conjugation) and phonology. These similar- 
ities have often been assumed to imply a rela- 
tively short time span of divergence since their 
common ancestor language, but these impres- 
sions are also unquantified. Our time-depth 
estimate implies a long period of relative sta- 
bility in these aspects, while early Indo-European 
diverged faster in other respects. Resolving these 
apparent contrasts in rates of change in different
aspects of language (38) is a target for
future research (see SM section 2.2.3).
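
A pairwise cognacy figure such as the 58.7% cited above can be pictured as the share of meanings, among those attested in both languages, for which the two languages' primary words fall in the same cognate set. The sketch below is only an illustration under that assumption; the exact IE-CoR counting rules are not given here, and the language entries and cognate-set identifiers are hypothetical toy data.

```python
def shared_cognacy(lang_a, lang_b):
    """Percentage of jointly attested meanings whose cognate sets coincide.

    Each argument maps a meaning label to a cognate-set identifier.
    """
    common = [meaning for meaning in lang_a if meaning in lang_b]
    if not common:
        raise ValueError("no meanings attested in both languages")
    matches = sum(1 for meaning in common if lang_a[meaning] == lang_b[meaning])
    return 100.0 * matches / len(common)


# Hypothetical toy entries; identifiers are arbitrary labels, not IE-CoR data.
language_x = {"MOUTH": "set1", "EAT": "set2", "GO": "set3", "FIRE": "set4"}
language_y = {"MOUTH": "set1", "EAT": "set5", "GO": "set3"}

print(round(shared_cognacy(language_x, language_y), 1))
```

Meanings missing from either language are left out of the denominator, so sparse attestation does not by itself lower the percentage.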
Second, our language tree topology can be
evaluated against established classifications of 
Indo-European languages. These classifications 
identify 10 to 12 main attested subgroups: 
Anatolian, Tocharian, Albanian, Armenian, 
Greek, Indic+Iranic, Baltic+Slavic, Germanic,
Italic, and Celtic. Our analyses (Fig. 2 and fig.
S6.1) returned all of these with 100% posterior
probability, including the two widely recognized
deeper clades, Indo-Iranic and Balto-Slavic.
Beyond this, qualitative methodology
in historical linguistics has failed to reach a 
consensus on how these main branches relate 
to each other in a higher-order branching, at 
the earliest stages of Indo-European expan- 
sion. Different language data support con- 
flicting tree structures. Classifications are 
either disputed or fall back on an unstruc- 
tured rake (2). Our analysis, however, does 
find strong support for specific deep clades— 
findings that bear directly on interpreting the 
latest aDNA results across Europe (16-19, 23, 39). 
Notably, Greek goes with Armenian, while a 
separate main European clade brings together 
Germanic, Celtic, and Italic (with Balto-Slavic 
as next closest). At the root of Indo-European, 
our results return Anatolian and Tocharian 
as deeply divergent clades. Support for them 
Fig. 3. Histogram of direct ancestry relationships between languages. The IE-CoR database includes
52 nonmodern languages (e.g., Ancient Greek, Classical Latin, and Early Vedic Sanskrit). This histogram
shows how many of these 52 languages are returned as directly ancestral to any other language(s) in the
dataset. The light-gray distribution shows the prior probability of the number of direct ancestor languages,
distributed around a modal value of 28. The dark-gray distribution shows the posterior probability
distribution. Only four languages show a posterior probability of being directly ancestral of >0.01: Classical
Armenian (as directly ancestral to modern Armenian) and three historical varieties of Greek [Mycenaean,
Ancient Greek (the Attic dialect), and New Testament Greek]. See table S5.2.


forming a joint clade, however, is very lim- 
ited (a posterior probability of only 25.9%). All 
three of the deepest clades have <26% support, 
in line with the lack of consensus among lin- 
guists. This may reflect complex “dialect con- 
tinua” in the early stages of Indo-European 
(40). Toward the tips of the tree, into the his- 
torical period when language relationships are 
most reliably known, our results generally 
make for a close fit with established classi- 
fications, such as the relationships between 
ancient languages in the Greek clade. Within 
the major clades, most of the expected sub- 
groups are also returned. In Romance, for ex- 
ample, the Romanian and Sardinian branches 
are the earliest to split off. Iberian Romance is 
also returned as a subgroup, as are North, West, 
and East Germanic; East and West Slavic; and 
Goidelic and Brythonic Celtic. Finally, we note 
some parts of our maximum clade credibility 
(MCC) tree that are not in line with established 
classifications. The Nuristani languages of 
the Hindu Kush, for instance, are nested more 
closely with their Indic neighbors than ex- 
pected on the basis of other linguistic data, par- 
ticularly phonology. Within Continental West 
Germanic, Frisian and historical varieties of 
German appear misplaced, as do various lan- 
guages within Southwestern Iranic. The sup- 
plement (SM section 8) provides full discussion 
of unexpected parts of the topology. 
Third, we ran a wide range of analyses to
test the robustness of our results to alterna- 
tive approaches. To identify the best-fitting 
model of cognate evolution, we first compared 
four models (M1 to M4). Our M1 analysis used 
a continuous-time Markov chain (CTMC) mod- 
el for binary data, with gamma rate heteroge- 
neity. Our M2 to M4 analyses all used a binary 
covarion model, which allows cognates to 
switch between fast and slow rates at points 
on the phylogeny, enabling languages to 
undergo bursts of change. M2 to M4 each 
used a different site model to accommodate 
variation in rates of cognate change. M2 used 
one rate for all meanings, M4 allowed a dif- 
ferent rate for every meaning, and M3 was an 
intermediate, compromise approach using 
eight different mutation rates, according to 
the number of cognate sets per meaning (in 
bins of 1 to 10, 11 to 20, etc.). As shown in Fig. 4 
(M1 to M4), results for the estimated time depth 
of Indo-European were similar across all four 
models. To identify which model performed 
best, we used path sampling to estimate the 
marginal log likelihood of each analysis (41).
The best-performing model was M3—the binary 
covarion with binned rates (see table S5.4)—so 
we took this as our main analysis, for which 
we report the results here. 
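
The binning behind M3 can be sketched as follows: each meaning is assigned to one of eight rate categories according to how many cognate sets it has, in steps of ten. This is an illustrative reading of the description above, not the authors' code. In particular, treating the eighth bin as open-ended is an assumption, and the meaning counts are hypothetical.

```python
from collections import defaultdict


def rate_bin(n_cognate_sets, width=10, n_bins=8):
    """Map a cognate-set count to a 0-based bin: 1-10 -> 0, 11-20 -> 1, and so on.

    Counts beyond the last bin are folded into it (an assumption of this sketch).
    """
    if n_cognate_sets < 1:
        raise ValueError("a meaning needs at least one cognate set")
    return min((n_cognate_sets - 1) // width, n_bins - 1)


def group_meanings(set_counts):
    """Group meaning labels by the rate bin that their cognate-set count falls in."""
    bins = defaultdict(list)
    for meaning, count in sorted(set_counts.items()):
        bins[rate_bin(count)].append(meaning)
    return dict(bins)


# Hypothetical cognate-set counts per meaning, for illustration only.
toy_counts = {"MOUTH": 7, "EAT": 12, "GO": 25, "FIRE": 10}
print(group_meanings(toy_counts))
```

Meanings in the same bin would then share one rate parameter, which sits between the single shared rate of M2 and the per-meaning rates of M4.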

To further test the robustness of our results, 
we continued with this best-fitting model, M3,
but varied the analysis in a series of other
respects: our sensitivity analyses SA1 to SA10 
(Fig. 4). In SA1, we addressed two particularly 
uncertain date calibrations. Vedic Sanskrit and 
Avestan are among the oldest languages in 
IE-CoR and thus offer especially deep cali- 
bration points. Their dating is controversial, 
however, because no original manuscripts sur- 
vive. We therefore reran our main (M3) model 
with these two deep calibrations removed. The 
effect on the root date for Indo-European 
was negligible: just 94 years (1.16%) older, at 
8214 yr B.P. (6785 to 9571 yr B.P.; Fig. 4, SA1). 
We also repeated the main analysis with the 
dataset adjusted to an alternative handling of 
one type of horizontal transmission (parallel 
loanwords) between language taxa (Fig. 4, 
SA2). Again, the effect on the root age estimate 
was minimal: 7934 yr B.P. (6487 to 9455 yr 
B.P.), that is, 186 years (2.29%) younger. 

We further tested the robustness of our results
to conditioning on the root (the first
branching event), rather than on the origin 
(the beginning of the root branch) as in pre- 
vious analyses (13, 42). This led to a median 
root age 690 years (8.52%) older, with more 
uncertainty: 8812 yr B.P. (6648 to 11,419 yr 
B.P.; Fig. 4, SA3). Counting discrete language 
taxa is complex, given the clinal nature of the 
distinction between language and dialect, so 
we also tested alternative values for the prior 
distribution on the sampling probability at 
present (Fig. 4, SA4). In the main analysis, we 
assumed an underlying present-day language 
diversity of between 400 and 600 languages 
across Indo-European (1, 2). Varying this assumption
does not substantially affect the root
age (8120 yr B.P.). Assuming 200 to 400 lan- 
guages present today gives a root age of 
8064 yr B.P. (6582 to 9585 yr B.P.), or 56 years 
(0.69%) younger (Fig. 4, SA4a). Assuming 600 
to 800 languages gives 8177 yr B.P. (6838 to 
9595 yr B.P.), or 57 years (0.70%) older (Fig. 4, 
SA4b). For some ancient languages, the sur- 
viving text corpora contain limited data, po- 
tentially biasing the analyses. We therefore 
ran a further sensitivity analysis (Fig. 4, SA5) 
without the 10 languages most affected by 
missing data; this gave a root date just 2 years 
(0.02%) younger, confirming that our main 
analysis is robust to the high proportions of 
missing data in such languages. 
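
The shifts quoted for the sensitivity analyses can be reproduced with simple arithmetic. The sketch below assumes that each shift is the difference between the alternative median and the main median of 8120 yr B.P., expressed as a percentage of the main median; that assumption reproduces the figures quoted for SA1, SA2, SA4a, and SA4b. The function name is introduced here for illustration.

```python
MAIN_ROOT_AGE = 8120  # median root age of the main (M3) analysis, yr B.P.


def root_shift(alt_age, main_age=MAIN_ROOT_AGE):
    """Return (years, percent, direction) of an alternative median vs. the main one."""
    diff = alt_age - main_age
    percent = round(100.0 * abs(diff) / main_age, 2)
    if diff > 0:
        direction = "older"
    elif diff < 0:
        direction = "younger"
    else:
        direction = "unchanged"
    return abs(diff), percent, direction


# Median root ages quoted in the text for four sensitivity analyses.
for label, age in [("SA1", 8214), ("SA2", 7934), ("SA4a", 8064), ("SA4b", 8177)]:
    print(label, root_shift(age))
```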

Our topologies are based on the data type 
most tractable for estimating chronology: cog- 
nacy in core vocabulary (27, 38). Established 
language classifications are based mainly on 
phonology and morphology, however. Evo- 
lutionary histories need not coincide exactly 
on these different levels of language. Where 
our cognacy trees most depart from estab- 
lished classifications (for the Nuristani lan- 
guages, Southwestern Iranic, and within West 
Germanic; see SM section 7.1), we tested the 
effect of applying lower-order clade constraints 
to enforce a topology in line with uncontroversial
phonological and morphological criteria
(Fig. 4, SA6a). This moved the median
Indo-European root date 804 years earlier 
(9.90% older). Separately, we applied higher- 
order constraints on the deepest relationships 
between all primary branches of Indo-European, 
to enforce a topology taken to support the 
Steppe hypothesis (5) (Fig. 4, SA6b). This moved 
the root date estimate 444 years earlier 
(5.47% older), further away from the steppe 
chronology. 

With previous Indo-European datasets, en- 
forcing ancestry constraints led to substantially 
younger root age estimates, enough to bring 
them into the time range predicted by the Steppe 
hypothesis (12). To test the impact of enforcing
direct ancestry on our new IE-CoR dataset, 
we implemented three different ancestry- 
constrained analyses (SM section 7.5). In our 
main analysis, only four languages had >0.01 
support for being direct ancestors. Enforc- 
ing those as ancestry constraints, and even 
adding the next (Old English, with support at 
only 0.0024), had minimal effect on the root 
date distribution, shifting the median esti- 
mate later by just 46 years (0.57% younger) 
(Fig. 4, SA7b, and table S7). If, contrary to our 
findings, written Classical Latin is nonethe- 
less constrained to be directly ancestral to 
spoken Romance, the median root date moves 
later by 331 years (4.08% younger; Fig. 4, SA7a), 
to 7889 yr B.P.; but within Romance, the first 
splits to Romanian and Sardinian are then too 
late to be compatible with historical and lin- 
guistic indications (SM section 6.5). Even if we 
constrain all 27 IE-CoR languages remotely 
conceivable as direct ancestors, the root shifts 
later only by 506 years (6.23% younger), to 
7614 yr B.P. (6239 to 9182 yr B.P.; Fig. 4, SA7c). 
Therefore, with the IE-CoR dataset, ancestry 
constraints do not lead to radically younger 
root ages. 

This robustness to ancestry constraints is 
driven by the greater consistency of IE-CoR 
compared with the earlier Indo-European 
Lexical Cognacy (IELex) dataset (11, 12). To 
confirm this, we took the “broad” (12) subset
of IELex with its associated clade constraints
(12) and applied to it our main, ancestry-enabled
analysis model and tree prior, with
(SA8b) and without (SA8a) the eight suggested
ancestry constraints (12). This confirmed that
with IELex, unlike with our IE-CoR dataset, 
enforcing direct ancestry does move the me- 
dian root date estimate into a far more recent 
time frame, younger by 3632 years (42.1%), 
from 8629 yr B.P. (Fig. 4, SA8a) to 4997 yr B.P. 
(Fig. 4, SA8b). This contrast in the IELex data- 
set being far more sensitive to ancestry con- 
straints than our IE-CoR dataset is explained 
by comparing the terminal branch lengths to 
the putative ancestor languages in the ancestry- 
enabled analyses for each dataset (fig. S7.8). 
Fig. 4. Posterior probability distributions of Indo-European com[...] Early Vedic and Younger [...] order clade
constraints. SA6b: With higher-order clade constraints following the Ringe topology (5). SA7a: With an
ancestry constraint for Latin only. SA7b: With ancestry constraints for the five languages with a >0 posterior
probability of being ancestral. SA7c: With all 27 remotely possible ancestry constraints. SA8a: Using the “broad”
subset (12) of the IELex database with ancestors enabled but not enforced. SA8b: Using the “broad” subset (12) of
the IELex database with ancestry enforced. SA9: With 57 meanings removed, those for which ancestral state
reconstruction (on analysis M3) showed polymorphism per meaning at the root. SA10: Using a multistate model of
cognate evolution. All sensitivity analyses SA1 to SA10 are based on model M3, the best-performing model.


These terminal branches are far longer (in some 
cases by >3000 years) with the IELex “broad” 
dataset than with IE-CoR. This excess branch 
length is caused by large numbers of excess 
entries in the IELex database, representing
not just the primary word for a given meaning
in any one language but one or more additional
words similar in meaning (i.e., near synonyms)
although not the primary term (27). In IELex,
these near synonyms had been entered highly 
inconsistently across the different languages 
(see fig. S1.4 and SM section 1.4). In a phylo- 
genetic analysis, these excess entries equate to 
additional gains (or losses) in cognate evolution. 
Where constraints force branch lengths to 
zero (i.e., direct ancestry), the artifactual gains 
or losses that would have fallen on these long 
terminal branches are instead pushed to oc- 
cur above the constrained ancestor language, 
after its time calibration. This in turn inflates 
the estimates of rates of change across the tree 
[from a median of 0.0055 (0.0046-0.0066) to 
0.0132 (0.0119-0.0145) changes per cognate 
set per thousand years], and these faster rate 
estimates result in younger root age estimates 
(12). With IE-CoR data, free of excess syno- 
nyms, results are much more robust to adding 
or removing ancestry constraints. A young 
age estimate for Indo-European resulted only 
from enforcing inappropriate ancestry con- 
straints on a problematic dataset. 

The artifacts that arise from excess syno- 
nyms are part of a wider methodological issue. 
Lexical evolution is multistate, but most phylo- 
genetic analysis methods take input data in 
binary format. IE-CoR follows strict protocols 
to ensure data consistency very close to a tar- 
get of only the single primary cognate set 
present per meaning per language. (IE-CoR 
can and does admit cases of absolute synon- 
ymy in meaning and usage, but these are rare.) 
To test for the impact of polymorphism, we 
used ancestral state reconstruction to identify 
any meanings for which our main covarion 
model did in fact “reconstruct” more than one 
cognate set per meaning at the root. In SA9,
we reran the main analysis but with these 
“root polymorphism” meanings excluded, leav- 
ing a remaining subset of 113 of the original 
170 IE-CoR meanings. The effect on the root 
age was minimal: just 255 years (3.11%) younger, 
at 7955 yr B.P. (6427 to 9436 yr B.P.; Fig. 4, SA9). 
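
The binary input format discussed above can be illustrated with a toy recoding of multistate lexical data. In this sketch each (meaning, cognate set) pair becomes one presence/absence column, using the MOUTH example from the text (Classical Latin os versus the bucca set in Romance). The cognate-set labels, the extra "Language Z" entry, and the use of "?" for missing data are assumptions of this illustration, not a description of the IE-CoR export format.

```python
def binarize(lexicon):
    """Recode {language: {meaning: cognate_set or None}} as binary strings.

    Returns the ordered (meaning, cognate_set) columns and one string per language:
    "1" if the language has that cognate set, "0" if it has another set for the
    same meaning, "?" if the meaning is unattested.
    """
    columns = sorted(
        {(meaning, cset)
         for entries in lexicon.values()
         for meaning, cset in entries.items()
         if cset is not None}
    )
    matrix = {}
    for language, entries in lexicon.items():
        row = []
        for meaning, cset in columns:
            attested = entries.get(meaning)
            if attested is None:
                row.append("?")  # missing data for this meaning
            else:
                row.append("1" if attested == cset else "0")
        matrix[language] = "".join(row)
    return columns, matrix


toy_lexicon = {
    "Classical Latin": {"MOUTH": "os"},
    "Italian": {"MOUTH": "bucca"},
    "Spanish": {"MOUTH": "bucca"},
    "Language Z": {"MOUTH": None},  # hypothetical language with no attested form
}
columns, matrix = binarize(toy_lexicon)
print(columns)
print(matrix)
```

With a single primary cognate set per meaning per language, each row has exactly one "1" among the columns of any given meaning. Extra near synonyms would add further "1" entries, which is the kind of excess that the text identifies as a problem in IELex.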

A more radical alternative is to switch to a 
different phylogenetic model that does direct- 
ly take multistate characters as its input data, 
rather than binary ones. We devised a multi- 
state model and applied it to the IE-CoR data- 
set, as SA10. This model did return notably 
younger root date estimates: 2057 years (25.1%) 
younger, at 6153 yr B.P. (4926 to 7884 yr B.P.; 
Fig. 4, SA10), and thus within the range of the 
original Steppe hypothesis (5). This contrast re- 
sults in particular from a difference in how the 
models handle polymorphism. Our main bi- 
nary covarion model does in effect admit poly- 
morphism per meaning, where supported by 
the data (typically over a period of transition 
from one word to another as the primary term 
for a given meaning). For analysis SA10, how- 
ever, the multistate model required an as- 
sumption of no polymorphism at any stage in 
the tree. In multiple respects, the results indicate
that this represents a relatively serious
model misspecification. Assessed against estab- 
lished classifications for the Indo-European 
family, the topology and the chronology (rela- 
tive and absolute) of the multistate tree are 
far more unexpected and problematic than 
the tree from the main binary covarion model. 
For example, the multistate model returns 
Tocharian as a late branch, deeply nested 
within the Indo-European tree together with 
Albanian, and fails to distinguish East from 
West Slavic correctly. 

Furthermore, in almost all cases where lan- 
guage splits can be historically dated, the 
multistate model seriously underestimates 
the time depth of those splits, compressing 
the chronology across the board. As a fur- 
ther qualitative performance benchmark, 
we used ancestral state reconstruction in 
BEAST2 to identify any innovations inferred 
on the terminal branch to each ancient lan- 
guage. The covarion model returned expected 
results, pinpointing cognate sets unique to 
individual language taxa. The multistate model 
failed to return many of these as innovations, 
clearly indicating a model misspecification and 
revealing why the multistate model underestimates
time depths. We therefore retain our
main results from the binary covarion model
(see SM section 7.10 for details and further
reasons).


Interpretation 


Our robust support for a root date estimate of 
~8120 yr B.P. (6740 to 9610 yr B.P.) has major 
implications for the origins of the Indo- 
European family, the prehistory of Eurasia, 
and the interpretation of the latest aDNA re- 
sults. The Indo-European question centers on 
where the PIE ancestor language was origi- 
nally spoken, before any of its first branches 
diverged outward. The main rival theories are 
named and defined by where they place that 
ultimate homeland: the Steppe hypothesis or 
the Anatolian hypothesis (see Boxes 2 and 3). 
Ancient DNA findings do support major ex- 
pansions into north-central Europe out of not 
just the Pontic-Caspian Steppe (16) but also 
the Forest Steppe (39), dated to between 5000
and 4500 yr B.P. and associated with the Corded
Ware culture (16). Our results show full support
(100% posterior probability) for some of
the main European branches of Indo-European 
remaining in a deep common clade until ap- 
proximately this time depth. Germanic and 
Celtic are estimated to have diverged from 


Box 2. Linguistics, archaeology, and genetics. 


Although Indo-European is a linguistic concept, it was principally archaeologists who set out and 
developed the best-known competing theories on its origins: the Steppe hypothesis (7, 65, 69) and the 
Anatolian, or “farming,” hypothesis (6, 70). Most recently, aDNA has brought revolutionary new results 
and perspectives and can provide chronological constraints and estimates for the magnitude of migratory
events in the past.

Linguistics, archaeology, and genetics use very different data and methods, however. Their different, partial
records of the past can complement each other, but correlating them is not straightforward. “Cultures”
inferred from the archaeological record do not match one-to-one with languages. Similarly, both matches and
mismatches can arise between linguistic and genetic lineages, because languages can spread either demically
or culturally (see SM section 2.1.2) (9). Findings in one discipline thus do not constitute proof or direct support
of those in another but can be less or more compatible with competing hypotheses for Indo-European
prehistory.

Speakers of Indo-European languages do not form a genetically homogeneous population. There is no
single, consistent genetic profile from Iceland to Bangladesh. Realistically, only some partial ancestry
component may be common to all or most speakers of Indo-European languages through time and
space. Current debate boils down to which of two potential “tracer dyes” makes for the best fit with
(Proto-)Indo-European.

* The ancestry profile of Yamnaya culture populations on the Pontic-Caspian Steppe spread widely
during the Bronze Age, from ~5000 yr B.P. This profile is a roughly equal (ad)mixture of two earlier
ancestries: the Eastern (European) hunter-gatherer (EHG) ancestry originally dominant in
Pontic-Caspian and the Caucasus hunter-gatherer (CHG)/Iranian Neolithic ancestry that admixed into the
Pontic-Caspian from ~7000 yr B.P.

* This CHG component alone is an alternative candidate for the Indo-European tracer dye. It is first
found south of the Caucasus but from ~7000 yr B.P. onward also reached the Pontic-Caspian
Steppe. Unlike EHG, the CHG component was also high in Anatolia at the time of the Hittites, who
spoke the Anatolian branch of Indo-European, and remains high among speakers of the Indo-Iranic
branch to this day.


However, these ancestry components are themselves not static singular entities. Rather, they represent 
momentary snapshots in time in prehistory, each emerging from preceding forms, and mixtures thereof. 
Genetic ancestry is fluid and clinal, and a matter of resolution, and therefore challenging to track—and 
relate to language lineages—unambiguously over many millennia. 
each other ~4890 yr B.P. (3720 to 6190 yr B.P.),
and Italic from them somewhat earlier, ~5560 yr 
B.P. (4230 to 6980 yr B.P.). Balto-Slavic is less 
closely associated with these three, splitting 
earlier, ~6460 yr B.P. (5040 to 7940 yr B.P.). 

The Albanian, Greek, Armenian, and Ana- 
tolian branches, however, all separate from 
this main European clade much deeper in the 
tree—with mean age estimates long before 
“steppe” ancestry spread into Europe. So, in 
both chronology and phylogeny, this expan- 
sion from the steppe appears as a secondary 
phase that carried only some branches of 
Indo-European into Europe. This is consistent 
with aDNA findings in other regions that do 
not support the predictions of the hypothesis 
that all Indo-European originated on the steppe 
(43). Currently, aDNA evidence does not sup- 
port a migration from the steppe through the 
Balkans into Anatolia (20, 22), where traces of 
steppe ancestry are conspicuously absent in 
the Bronze Age (21-23). Steppe ancestry is also 
largely absent in ancient Greek Early Bronze 
Age individuals, who instead carry some 
Early European farmer-like ancestry, and 
~25% Caucasus hunter-gatherer/Iranian-like 
ancestry (19, 44). [The latter was first reported 
as maximized in hunter-gatherers from the 
South Caucasus (45) and early herders/farmers 
in northwestern Iran (46, 47), particularly 
the Zagros, hence the label “CHG/Iranian.”] 
Steppe ancestry up to 50% is attested in Greece 
only after ~4000 yr B.P. in Middle and Late 
Bronze Age (Mycenaean) individuals (19), 
with an admixture date estimate of ~4600 to 
4000 yr B.P. Ancient Armenians carry pre- 
dominantly a mix of mostly CHG/Iranian- 
like (40 to 60%) and Anatolian Neolithic-like 
ancestry (20 to 40%) and receive only a late 
contribution of steppe ancestry during the 
Late Bronze Age, ~3500 to 3000 yr B.P. [as 
indicated by the appearance of ~15% Eastern 
(European) hunter-gatherer (EHG) ancestry], 
which drops to low proportions at ~2000 yr B.P. 
(44, 46, 48). 

Steppe ancestry, in the form of a mix of 
EHG+CHG/Iranian-like ancestry, thus did 
not reach Greece and Armenia until long after 
the population movements into northern and 
central Europe out of the Pontic-Caspian Steppe 
and Forest Steppe ~5000 yr B.P. In our phylo- 
genetic results, Greek and Armenian show no 
close relationship to the main branches in Eu- 
rope that plausibly fit with expansion from the 
steppe: Germanic-Italic-Celtic and possibly Balto-
Slavic. Earlier, however, during the Chalcolithic
and Eneolithic periods ~6500 to 5500 yr B.P., 
CHG/Iranian-like ancestry had already spread 
across Anatolia, the Caucasus, northern Meso- 
potamia, and southeastern Europe and had also 
come to form an integral part of the genomic 
landscape in the North Pontic region during the 
Steppe Eneolithic. This expansion of CHG/ 
Iranian-like ancestry represents an alternative 
candidate for spreading early branches of
Indo-European in these regions. 

Results from aDNA research thus cannot be 
fully reconciled with the idea that PIE, and all 
branches, ultimately originated on the steppe. 
Recent interpretations of the aDNA record 
(5, 49) nonetheless continue to follow a recent 
formulation of the Steppe hypothesis (5) that 
keeps the steppe as the ultimate homeland and 
posits a corresponding tree topology (5, 50, 57), 
albeit one that does not command linguistic 
consensus. In particular, in this hypothesis, 
Indo-Iranic, the major eastern branch of Indo- 
European, was one of the last two main branches 
to emerge, out of a final major clade with Balto- 
Slavic. Our results contradict this in both chro- 
nology and tree topology. Indo-Iranic branches 
off early, ~6980 yr B.P. (5650 to 8400 yr B.P.), 
and support for a common clade with Balto- 
Slavic is minimal, with a posterior probability 
of only 12.3%. Recent aDNA data from Central 
and South Asia have sought to trace movements 
of people into Western and South Asia by mi- 
grations southward from the steppe. However, 
for the period 4300-3700 yr B.P., samples from 
the Bactria-Margiana Archaeological Complex 
(BMAC) do not yet attest to any such south- 
ward migration (49). Steppe ancestry is not 
found until ~3500 yr B.P., in the Gandhara 
Grave Culture in northern Pakistan, and only 
at limited proportions (49). The interpretation 
that this ancestry can be identified with the 
first Indo-Iranic dispersal into South Asia (49) 
is not straightforwardly compatible with our 
earlier date for the separation of Indo-Iranic 
from the rest of Indo-European (~6980 yr B.P.). 
We also find that Indic and Iranic had diverged 
from each other already by ~5520 yr B.P. (4540 
to 6800 yr B.P.). To reconcile this with a steppe 
origin would require an alternative scenario in 
which Indic and Iranic split from each other 
approximately two millennia before entering 
South Asia and Western Asia. 

Our analysis indicates that the Indo-European 
family began with a series of major branch- 
ing events in relatively quick succession. From 
~8120 yr B.P. (6740 to 9610 yr B.P.) to 6140 yr 
B.P. (4540 to 7880 yr B.P.), Indo-European 
had split into seven branches (see Table 1 and 
fig. S6.1), long before “steppe” ancestry spread 
into Europe and the Altai. These seven include 
the Anatolian, Greco-Armenian, and Indo- 
Iranic branches, for which aDNA shows little 
or no genetic influx from the steppe at ~5300 
to 4900 yr B.P.—that is, at time depths early 
enough to match our estimated split times. 
Ancient DNA does, however, indicate a spread 
of CHG/Iranian ancestry in the opposite di- 
rection, from south of the Caucasus into the 
steppe at ~7000 to 6200 yr B.P. (48), which 
created the diagnostic “steppe” mix of ances- 
tries that would later also enter Europe, ~5000 
to 4500 yr B.P. This CHG/Iranian component 
is found first south of the Caucasus, including 
in the north to northeastern arc of the Fertile
Crescent, among early farmers on the flanks 
of the Zagros Mountains in western Iran (47). 
The same CHG/Iranian (48) ancestry compo- 
nent also admixes heavily (by ~5000 yr B.P.) 
(22, 23) into the region where languages of the 
Anatolian branch are first documented. CHG/ 
Iranian is the dominant ancestry in ancient 
Armenia and Iran, in BMAC, and in most 
present-day populations who speak languages 
of the Iranic branch. It is also a major ancestry 
component among speakers of the Indic branch, 
particularly in regions furthest from the Dravidian-
speaking (i.e., non-Indo-European) south of India.
Thus, it is the CHG/Iranian ancestry compo- 
nent that most strongly connects the past pop- 
ulations who potentially spoke the branches 
of Indo-European in Europe and south (and 
east) of the Caucasus. Our earlier date esti- 
mates for the separation of Indo-Iranic from 
other Indo-European languages (49, 52) are in
line with this scenario.

Together, our linguistic results and the aDNA 
data are fully compatible with neither the 
Steppe hypothesis (Fig. 1B) nor the farming 
hypothesis (Fig. 1C). Instead, we propose a 
hybrid hypothesis (Fig. 1D) in which Indo- 
European languages spread out of an initial 
homeland south of the Caucasus, in the north- 
ern Fertile Crescent (Box 3). Only one major 
branch spread northward onto the steppe and 
then across much of Europe. This proposal 
matches parts of an existing alternative “South 
Caucasus” hypothesis (53-55), but the tree 
topology differs. The first migration phases are 
substantially earlier, and the main migration 
to the steppe follows a different route, through 
the Caucasus rather than through Central 
Asia. Crucially, south of the Caucasus is where 
aDNA first locates the only ancestry compo- 
nent found at high proportions in populations 
(past and present) associated with both Indo- 
Iranic and the main European branches of 
Indo-European. This genetic ancestry also em- 
erged in southeastern Europe during the Late 
Chalcolithic/Early Bronze Age and predated 
the spread of “steppe” ancestry. (The Paleo- 
Balkan branches of Indo-European were for- 
merly spoken in this region, but too few records 
survive to include them in our dataset.) Our 
hybrid hypothesis posits that out of this home- 
land south of the Caucasus, from ~8120 yr B.P., 
PIE began to diverge as early migrations split 
it into multiple early branches. One of these 
branches could have taken Indo-Iranic east- 
ward far earlier than the Steppe hypothesis 
presumes, but in line with the linguistic chro- 
nology in Fig. 3, in which Indo-Iranic emerged 
as a distinct branch in the early phases of 
Indo-European divergence. Another main 
branch reached the steppe directly northward 
through the Caucasus ~7000 to 6500 yr B.P., 
compatible with one current interpretation 
of the aDNA record (48). The steppe became 
Box 3. What’s in a name? Shifting perceptions of the Steppe hypothesis.


The Indo-European question centers on where the common PIE ancestor language was originally spoken,
before any of its first branches diverged outward. The main rival theories are named and defined by where
they place that ultimate homeland: the Steppe hypothesis (5) contrasts with both the Anatolian hypothesis (6)
and a lesser-known Armenian hypothesis (53, 54).

In the Steppe hypothesis, all branches of Indo-European ultimately go back to migrations out of the Pontic-
Caspian Steppe. By definition, this has included a steppe origin for the Anatolian and Tocharian branches (5).
Other hypotheses do recognize a prominent role for the steppe, as a staging post for some branches of
Indo-European heading either westward (54) or eastward, in Renfrew’s variant B (6). Nonetheless, these
hypotheses reject the idea that all branches originated on the steppe. They instead posit that Indo-European
owes its full scale and diversity to cultural and demographic developments not just on the Pontic-Caspian
Steppe but ultimately to earlier, deeper causes in lands farther south, in the southern Caucasus or northern
Fertile Crescent.

Early aDNA results did support one “massive migration” out of the steppe, into parts of Europe, although it
was qualified as “a” source for “at least some” Indo-European languages “in Europe” (16). As the aDNA record
has grown, interpretations have continued to hold back from identifying the steppe as the source of all
branches, notably Indo-Iranic (45) and especially Anatolian (21, 23, 24).

Anatolian is often hypothesized as first to branch off from the rest of the family, followed by Tocharian.
There is no full linguistic consensus on this, but “Anatolian first” has led to alternative names and qual-
ifications that can cloud the homeland issue. If (only) extant or Late Indo-European emerged from the steppe,
whereas extinct Anatolian and/or Tocharian did not, then strictly the steppe was not the original homeland.
Even if the family is rebaptized “Indo-Anatolian” (23), which reflects neither its geographic coverage nor
a particular branching order, this does not change the basic question of where the original homeland of
the family as a whole was. The relatedness of Anatolian within the family is not in doubt, so if it (or any other
branches) did not originate on the steppe, then Indo-European origins lie not in the Steppe hypothesis proper
but rather in some form of hybrid hypothesis.


a secondary homeland for the later Yamnaya- 
and Corded Ware-related expansions into parts 
of Europe and north-central Asia. 

Our results do not directly identify by which 
route Indo-Iranic spread eastward, so it re- 
mains possible that this branch spread through 
the steppe and Central Asia, looping north 
around the Caspian Sea (Fig. 1D). Recent in- 
terpretations of aDNA argue for this (49, 52), 
but some aspects of their scenario are not easy 
to reconcile with our linguistic findings. For 
example, Indo-Iranic is an early independent 
branch in our analyses, with no close relation- 
ship to Balto-Slavic (see Box 1 and SM section 
7.6.2.1), so that argument in favor of a north- 
ern route falls away. Genetically, the ancestry 
of Indo-Iranic speakers also derives much more 
heavily from south of the Caucasus and from 
Neolithic Iran than from the Bronze Age steppe 
(16) (see Box 2). Previous interpretations of
aDNA from one individual from the Indus
Periphery sought to exclude a direct eastward
route on the basis of the degree and timing of 
Anatolian admixture (49, 52), but these have 
been superseded by methodological and ana- 
lytical refinements, which no longer exclude 
this scenario entirely (56). More parsimonious 
geographically, at least, would be a route for 
Indo-Iranic directly eastward out of a South 


Table 1. Estimated time depths of the 12 main well-attested clades of Indo-European and higher-order clades with high posterior probability 
support. All date estimates are given in years before present, meaning before 2000 CE. The “time depth as independent clade” dates for [Balto-Slavic] + [Italic + 
Germanic + Celtic], Indo-Iranic, Greco-Armenian, Anatolian, Tocharian, and Albanian are merely indicative, based on splits with <50% posterior support. Date estimates
shown are the height_median and height_95%_HPD values in the MCC tree file; see also fig. S6.1. 


Major clade (with 
high posterior 
probability support) 


(Proto-)Indo-European 


[Balto-Slavic] + [Italic + 
Germanic + Celtic] 


enlan 


Albanian 


Time depth as independent 
clade (split from rest 
of Indo-European) 
Median Posterior 95% HPD 
(yr B.P.) probability (yr B.P.) 


6135 0.49 4540-7882 
Time depth of divergence
within clade (between 
languages attested) 


Median Posterior 95% HPD 
(yr B.P.) probability (yr B.P.) 
8116 6735-9613 


1067 1 468-1882 
Caucasus homeland through the Iranian Pla-
teau, south of the Caspian (Fig. 1D). 

Ancient DNA provides evidence of past 
population expansions over the same broad 
contexts in time and space that saw the Indo- 
European languages diverge and spread. These 
aDNA data suggest that the steppe did play a 
major role in spreading some of the European 
branches, but they also confirm that (at least) 
the Anatolian branch did not originate there. 
This thus points to an ultimate homeland for 
the Indo-European family south of the Caucasus 
(23). The obvious remaining question is 
whether all branches other than Anatolian 
came from the steppe, or only some. For some 
branches, the past population expansions and 
admixture events detected in aDNA, and hy- 
pothesized as having spread those forms of 
Indo-European, had only limited genetic im- 
pact. Our Bayesian phylogenetic analyses 
show that those candidate population expan- 
sions also postdate the linguistic divergences. 
Ancient DNA and linguistic phylogenetics 
thus combine to suggest that the resolu- 
tion to the 200-year-old Indo-European enigma 
lies in a hybrid of both the farming and Steppe 
hypotheses. 


Methods summary 
Linguistic methodology 


The IE-CoR database stores data on cognate 
relationships (shared word origin) between 161 
Indo-European languages, in a reference set of 
170 basic meanings. Across these languages and 
meanings, IE-CoR has a total of 25,918 individ- 
ual lexeme entries. These lexemes are analyzed 
into 5013 cognate sets. The linguistic data and 
supporting citations can be explored and down- 
loaded at iecor.clld.org. 

Databases used in previous phylogenetic 
analyses have been undermined by a series of 
identifiable failings. To solve these, IE-CoR 
introduces a series of innovations in the meth- 
odology of database design, data collection, and 
the coding of language data, for both vertical 
(cognate) and horizontal (loanword) transmis-
sion. First, in coverage of language taxa, IE-CoR
sampling provides denser coverage of the Indo- 
European family: 161 languages, as opposed to 
24 (51), 84 (57), 87 (10), 103 (11), and 52, 82, or 
94 (12) languages in previous databases [for a 
comparative table, see table 1 in (27)]. Sam- 
pling is also more balanced across all main 
branches of the Indo-European family and 
fills in gaps in the geographical coverage of 
previous databases. IE-CoR now covers,
for example, extinct Iranic languages of the 
steppe and Central Asia, the Nuristani branch 
of Indo-Iranic languages, and Gaulish as a 
representative of ancient Continental Celtic. 
Coverage also prioritizes nonmodern languages 
(52 in IE-CoR), to provide deeper phylogenetic 
signal and a fuller range of calibration points 
for the chronological estimation. 
The linguistic data in previous databases
were encoded essentially by a single linguist 
(51, 57) and have been criticized for poor data 
quality (58). IE-CoR coordinated more than 
80 specialists in the languages and branches 
concerned. Past database methodology also 
led to datasets being inconsistently coded. In 
particular, some languages were encoded with 
a proliferation of synonymous lexeme entries. 
This created wide disparities in the number 
of cognate sets present per language (fig. S1). 
These disparities can skew the estimations 
of branch lengths, rates of evolution, and chro- 
nology in phylogenetic outputs (27) (SM sec- 
tion 1.4). IE-CoR applies a strict and low 5% 
tolerance limit for multiple synonymy, as well 
as a new methodology to minimize scope for
data inconsistency across all coders, languages, 
and meanings. Data coding procedures follow 
explicit new consistency protocols for both 
lexeme determination in each language and 
cognate determination between languages. 
The IE-CoR set of 170 reference meanings was 
itself optimized, first with reference to quan- 
titative analyses of worldwide stability and 
borrowability of lexical meanings (59), and sec- 
ondly by applying the same IE-CoR consistency 
protocols to systematize the (re)definitions 
of all meanings, to give a narrower and un- 
ambiguous specification of the exact target 
sense of each. Finally, loanwords are instances 
of horizontal transmission between languages 
and thus a potential confound to phylogenet- 
ic analyses. IE-CoR introduces a methodol- 
ogy to address inadequacies in how previous 
datasets have analyzed loanwords. In partic- 
ular, new data structures distinguish the differ- 
ent consequences, for phylogenetic analysis, 
when loan events either give rise to indepen- 
dent cognate sets of their own or drive parallel 
changes across multiple, already divergent lan- 
guages. This database methodology is presented 
in full in the supplement (SM section 3). 


Phylogenetic analysis 


We use Bayesian phylogenetic inference (60) 
to estimate root ages and how many ancient 
languages are “sampled ancestors” (i.e., di- 
rectly ancestral to modern ones). For details 
on the method, see (61). Specific details for 
the application to cognate data can be found 
in the supplementary materials of analogous 
previous work (11, 62). Previous phylogenetic 
analyses of cognate data have assumed that no 
language in the dataset was directly ancestral 
to any other (10, 11, 63). Forcing the opposite
assumption—that many ancient languages 
were directly ancestral—returned significantly 
different root estimates (12) as well as un- 
tenable clade age estimates in known histor- 
ical cases. In this study, we employed a method 
that uses reversible jump proposals during the 
Markov chain Monte Carlo run, allowing an- 
cient languages to switch from being ancestral 
to nonancestral, and vice versa (25). In this
approach, the posterior probability that an 
ancient language is ancestral is the propor- 
tion of the posterior sample in which it is 
ancestral. The actual proportion does not ne- 
cessarily fit the assumption that it is either 
zero (10, 11, 63) or 1.
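
The estimate described above can be illustrated with a minimal Python sketch. It assumes that each posterior sample has already been reduced to the set of ancient languages that are sampled ancestors in that sampled tree. The function name, the language labels, and the four-sample posterior are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Set


def ancestral_posterior(samples: List[Set[str]], languages: List[str]) -> Dict[str, float]:
    """Return, for each language, the fraction of posterior samples
    in which that language appears as a sampled ancestor."""
    n = len(samples)
    return {lang: sum(lang in s for s in samples) / n for lang in languages}


# Hypothetical toy posterior: one set per sampled tree, listing the
# ancient languages that are directly ancestral in that tree.
toy_samples = [{"lang_A"}, {"lang_A", "lang_B"}, set(), {"lang_A"}]

posterior = ancestral_posterior(toy_samples, ["lang_A", "lang_B", "lang_C"])
for lang, p in posterior.items():
    print(f"{lang}: posterior probability of being ancestral = {p:.2f}")
```

In this toy case the proportions fall strictly between 0 and 1 for two of the three languages, which is the situation that a fixed all-or-nothing assumption cannot represent.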

Following earlier work (11, 62, 63), we used 
the covarion model (64) as a substitution mod- 
el, and an uncorrelated relaxed clock with a 
log-normal distribution (26). We used path 
sampling (41) to compare a range of setups for the sub-
stitution model and obtained the best fit when
the 170 IE-CoR meanings were binned by the 
number of cognate sets per meaning, and each 
bin was associated with a different mutation 
rate (fig. S5.3). The tree prior was parameter- 
ized by the quotient of a diversification rate 
and an extinction rate, the extinction rate it- 
self, a sampling proportion through time, and 
a sampling probability at present (72). To- 
gether, these parameters drive the process that 
generates the tree, leading to older or younger 
trees, and more or fewer sampled ancestors. 
We assumed that the diversification rate y 
and the extinction rate A are of the same order 
of magnitude (log-normal prior distribution 
with mean 0 and standard deviation 1 applied 
to the quotient y/A). We applied a highly con- 
servative Exp(0.2) prior distribution on the ex- 
tinction rate, which translates to an average 
time to lineage extinction of 5000 years. 
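
As a rough arithmetic check of the last two priors, the following Python sketch assumes that rates are expressed per lineage per millennium and that Exp(0.2) denotes an exponential distribution with mean 0.2. Both are assumptions made here for illustration, and the variable names are hypothetical.

```python
import math

# Assumption: Exp(0.2) is parameterized by its mean, in events per lineage per millennium.
MEAN_EXTINCTION_RATE = 0.2
YEARS_PER_MILLENNIUM = 1000

# Expected waiting time to extinction is the reciprocal of the rate.
expected_lifetime_years = YEARS_PER_MILLENNIUM / MEAN_EXTINCTION_RATE

# Log-normal prior (mean 0, standard deviation 1 in log space) on the
# quotient of the diversification and extinction rates:
# the median is exp(0) = 1 and the central 95% range is exp(-1.96) to exp(1.96).
quotient_median = math.exp(0.0)
quotient_low = math.exp(-1.96)
quotient_high = math.exp(1.96)

print(f"expected time to lineage extinction: {expected_lifetime_years:.0f} years")
print(f"quotient prior: median {quotient_median:.1f}, "
      f"central 95% range {quotient_low:.2f} to {quotient_high:.2f}")
```

Under these assumptions the expected lifetime comes out at 5000 years, and the central range of the quotient stays within about one order of magnitude either side of 1, in line with the stated prior belief that the two rates are of the same order of magnitude.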

To estimate the sampling proportion, three 
time periods need to be considered: the time 
before 4400 yr B.P., when no ancient lan- 
guages are sampled, where the sampling pro- 
portion is zero; the time after the youngest 
nonmodern language, after which the sam- 
pling proportion is also zero; and the time be- 
tween those two boundaries, when ancient 
languages were indeed sampled. This “ancient 
sampling proportion” is bound by an unin- 
formative uniform prior distribution between 
0 and 1. The sampling probability at present 
(what proportion of all contemporary lan- 
guages are actually covered in the IE-CoR 
database) is bound by an informative beta 
distribution ([109,400]), which assumes that 
the modern languages in our dataset are a 
subset of about 400 to 600 contemporary 
Indo-European languages. We also assumed 
that the origin—the start of the branch above 
the root of the tree—does not exceed 10,000 yr 
B.P., as an upper bound on the beginning of 
divergence between Indo-European languages. 
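
The link between the beta prior and the stated range of 400 to 600 contemporary languages can be checked with a short Python sketch. It assumes that ([109,400]) denotes a Beta(109, 400) distribution and that the dataset holds 109 modern languages (161 in total minus the 52 nonmodern ones), which is an inference from the figures given earlier rather than a stated value. The interval uses a simple mean plus or minus two standard deviations, not an exact quantile calculation.

```python
import math


def beta_mean_sd(a: float, b: float) -> tuple:
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(variance)


# Assumed prior on the sampling probability at present.
mean, sd = beta_mean_sd(109, 400)

# Assumption: 109 modern languages sampled (161 total minus 52 nonmodern).
n_modern = 109

# Total number of contemporary languages implied by a given sampling probability.
implied_total = n_modern / mean
low_total = n_modern / (mean + 2 * sd)   # higher sampling probability, fewer languages
high_total = n_modern / (mean - 2 * sd)  # lower sampling probability, more languages

print(f"prior mean sampling probability: {mean:.3f} (sd {sd:.3f})")
print(f"implied total languages: about {implied_total:.0f} "
      f"(roughly {low_total:.0f} to {high_total:.0f})")
```

With these assumptions the prior centers on about 509 contemporary languages, with an approximate range of 435 to 613, which is consistent with the "about 400 to 600" stated in the text.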
NEURODEGENERATION


TRIM11 protects against tauopathies and is
down-regulated in Alzheimer’s disease


Zi-Yang Zhang, Dilshan S. Harischandra, Ruifang Wang, Shivani Ghaisas, Janet Y. Zhao, 
Thomas P. McMonagle, Guixin Zhu, Kenzo D. Lacuarta, Jianing Song, John Q. Trojanowski, 


Hong Xu, Virginia M.-Y. Lee, Xiaolu Yang* 


INTRODUCTION: Tauopathies encompass Alzhei- 
mer’s disease (AD), the most common form of 
dementia, as well as more than 20 other neuro- 
degenerative disorders. These diseases are path- 
ologically defined by intracellular neurofibrillary 
tangles (NFTs) composed of the hyperphosphoryl- 
ated and filamentous form of the microtubule- 
associated protein tau. It is known that mutations 
in the tau-encoding gene MAPT cause a heritable
subset of tauopathies. However, how tau is con- 
verted from soluble monomers to insoluble 
fibrillar aggregates remains enigmatic. This 
conspicuous gap in our knowledge hinders the 
development of mechanism-based therapies 
for this large group of diseases. 


RATIONALE: In both sporadic and heritable tau- 
opathies, the accumulation of tau aggregates 
occurs in an age-dependent manner, suggest- 
ing that cellular factors could function early in 
life to inhibit tau misfolding and aggregation. 
Organisms in all kingdoms of life rely on pro- 
tein quality control (PQC) systems to remove 
defective or superfluous normal proteins, to 
prevent protein misfolding and aggregation, 
and to dissolve preexisting protein aggregates.
Recent studies suggested that tripartite motif 
(TRIM) proteins, which are found only in meta- 
zoans, may participate in multiple aspects of 
PQC in these highly complex life forms. Here, 
we investigated the TRIM system to look for its 
potential role in the pathogenesis of tauopathy, 
investigate its underlying mechanisms, and test 
its potential utility as a therapeutic agent.


RESULTS: We examined more than 70 human 
TRIMs and observed that TRIM10, TRIM55, 
and especially TRIM11 exhibited a strong ability 
to remove aggregates formed by a disease- 
associated tau mutant. An analysis of these 
three TRIMs in postmortem brain tissue from 
23 AD individuals and 14 age- and sex-matched 
control individuals revealed that the levels of 
TRIM11 protein, but not mRNA, were substan- 
tially reduced in AD brains compared with 
control brains, whereas the mRNA and protein
levels of TRIM10 and TRIM55 remained
unchanged. 

Mechanistically, TRIM11 has three activities 
relevant to tau. First, it mediates the turnover 
Down-regulation of TRIM11 in AD brains and potential therapeutic benefit of its restoration.

(A) TRIM11 expression is inversely correlated with tau pathology. (B) TRIM11 promotes the degradation of 
mutant and excess normal tau and also functions as a molecular chaperone and a disaggregase, thus 
enhancing tau solubility. (C) TRIM11 is neuroprotective, and AAV-mediated intracranial delivery of TRIM11
rescues multiple mouse models of tauopathy.
this by binding to tau, especially the mutant
variants or hyperphosphorylated species, and 
promoting their SUMOylation, leading to their 
proteasomal degradation. Therefore, TRIM11 
represents a crucial link between tau and the 
proteasome. Second, TRIM11 functions as a
molecular chaperone for tau, obviating tau mis- 
folding and aggregation. Third, TRIM11 is a 
disaggregase for tau, dissolving preexisting 
tau deposits including the intractable fibrillar 
aggregates. TRIM11 chaperone and disaggregase 
activities do not rely on ATP and operate efficiently 
even at low substoichiometric TRIM11/tau ratios. 

In various cell models, TRIM11 protects against 
spontaneous and seeded aggregation of intra- 
cellular tau, keeping tau in its functional soluble 
form. In cultured primary neurons, down- 
regulation of endogenous TRIM11 impairs, 
whereas increased expression of TRIM11 pro-
motes, neuronal viability and the formation of
presynaptic and postsynaptic puncta. Thus, 
TRIM11 is an important neuroprotective factor. 

To examine the role of TRIM11 in the mam- 
malian brain and to evaluate its therapeutic 
potential, we used multiple mouse models of 
tauopathy, including PS19 mice, which express 
the familial tau mutant P301L; PS19 mice injected
with preformed tau fibrils to accentuate dis- 
ease phenotypes; and 3xTg-AD mice, which 
express tau P301L and familial mutations in 
two AD-related proteins, the amyloid pre- 
cursor protein and presenilin 1. TRIM11 was 
delivered through an adeno-associated viral 
(AAV) vector locally to the hippocampus or 
globally in the brain through the cerebrospinal 
fluid. In these models, TRIM11 suppresses tau 
pathology and neuroinflammation and im- 
proves cognitive and motor ability. 


CONCLUSION: These data suggest that TRIM11 
plays an important role in protecting against 
tauopathies and that its down-regulation might 
contribute to the pathogenesis of these diseases. 
Our results also highlight a potent, metazoan- 
specific PQC system with individual components 
that can perform multiple tasks using mech- 
anisms distinct from those of canonical, ATP- 
dependent PQC systems. The up-regulation of 
TRIM11 through small molecules might be fea- 
sible given that its expression appears to be 
highly regulated. Moreover, our findings pro- 
vide a proof of concept for the TRIM11 gene
itself as a therapeutic agent, bolstering PQC 
capacity and thus addressing the root cause 
of various neurodegenerative tauopathies. 
TRIM11 protects against tauopathies and is
down-regulated in Alzheimer’s disease


Zi-Yang Zhang, Dilshan S. Harischandra, Ruifang Wang, Shivani Ghaisas, Janet Y. Zhao,
Thomas P. McMonagle, Guixin Zhu, Kenzo D. Lacuarta, Jianing Song, John Q. Trojanowski,
Hong Xu, Virginia M.-Y. Lee, Xiaolu Yang*


Aggregation of tau into filamentous inclusions underlies Alzheimer’s disease (AD) and numerous other 
neurodegenerative tauopathies. The pathogenesis of tauopathies remains unclear, which impedes the 
development of disease-modifying treatments. Here, by systematically analyzing human tripartite motif 
(TRIM) proteins, we identified a few TRIMs that could potently inhibit tau aggregation. Among them, 
TRIM11 was markedly down-regulated in AD brains. TRIM11 promoted the proteasomal degradation of 
mutant tau as well as superfluous normal tau. It also enhanced tau solubility by acting as both a 
molecular chaperone to prevent tau misfolding and a disaggregase to dissolve preformed tau fibrils. 
TRIM11 maintained the connectivity and viability of neurons. Intracranial delivery of TRIM11 through 
adeno-associated viruses ameliorated pathology, neuroinflammation, and cognitive impairments in multiple 
animal models of tauopathies. These results suggest that TRIM11 down-regulation contributes to the 
pathogenesis of tauopathies and that restoring TRIM11 expression may represent an effective
therapeutic strategy.


Intracellular neurofibrillary tangles (NFTs)
composed of hyperphosphorylated forms
of the microtubule-associated protein tau
are the pathological hallmark shared by >20
heterogeneous dementias and movement
disorders, collectively referred to as tauopathies
(1-3). Among them, progressive supranuclear
palsy, corticobasal degeneration, Pick’s disease, 
and many others are primary tauopathies that 
display no other major pathological abnormali- 
ties, whereas Alzheimer’s disease (AD), the most 
common cause of dementia, is a secondary 
tauopathy additionally characterized by the 
presence of extracellular amyloid β (Aβ) plaques
(4, 5). In a familial subset of primary tauopathies, 
the tau-encoding gene MAPT is mutated (6-8). 
In both primary tauopathies and AD, NFT bur- 
den correlates with cognitive decline and neuro- 
degeneration (9-11). Moreover, tau is required 
for Aβ-induced neurotoxicity (12). Therefore, tau
misfolding and aggregation likely represent the
main disease-causing event for AD and the other 
tauopathies. 

To maintain proteins in their functional 
soluble form, organisms in all kingdoms of life 
have evolved protein quality control (PQC) 
systems (13-15). These systems include degra-
dative pathways that recycle defective proteins 
and superfluous normal proteins, molecular 
chaperones that prevent protein misfolding 
and aggregation, and disaggregases that dis- 
solve preexisting protein deposits. The conver- 
sion of tau from soluble monomers to fibrillar 
aggregates in tauopathies in an age-dependent 
manner suggests a diminishing capacity of a 
PQC system that can normally protect against 
tau aggregation. Nevertheless, the identity and 
nature of such a PQC system remain undefined. 

Tripartite motif (TRIM) proteins are charac- 
terized by an RBCC region containing a RING 
domain, one or two B-box motifs, and a coiled 
coil (fig. S1) (16, 17). These proteins are present 
only in metazoans and their number, including 
>70 in humans, has expanded substantially 
during evolution. Recent studies suggest that 
some TRIM proteins may participate in PQC 
(18-21), although other TRIMs can aggravate 
protein aggregation (22). Moreover, TRIM21, a 
cytoplasmic antibody receptor, mediates the 
degradation of antibody-coated viruses or extra- 
cellular proteins that manage to enter the 
cytoplasm or intracellular proteins for which 
specific antibodies can be delivered (23-27). Here, 
we examined the role of the TRIM system in 
the pathogenesis of tauopathy, investigated its 
mechanism of action, and explored its utility 
for disease treatment. 

Systematic analysis of TRIM proteins

To determine the effect of TRIM proteins on 
tau, we combined two approaches: (i) systemat- 
ically analyzing virtually all known human 
TRIMs for their capability to remove tau 
aggregates in cultured cells and (ii) for TRIMs 
that exhibit a potent effect, comparing their 
expression in postmortem brain tissues from 
AD and control individuals. For the systema- 
tic analysis, we cloned 75 TRIMs individually 
into a mammalian expression vector (table S1). 
Each TRIM was introduced in human embry- 
onic kidney 293T (HEK293T) cells together 
with GFP-tau P301L, an enhanced green fluo- 
rescence protein (GFP) fusion of the longest 
isoform of human tau (441 residues with two 
N-terminal inserts and four microtubule- 
binding repeats, 2N4R) carrying P301L, a muta- 
tion associated with familial tauopathies (6, 8). 
GFP-tau P301L alone generated aggregated 
species that were insoluble in non-ionic deter- 
gent (Fig. 1, A to C). When coexpressed with 
GFP-tau P301L, most TRIMs were unable to 
clear GFP-tau P301L aggregates. However, 
three TRIMs (TRIM10, TRIM11, and TRIM55) 
displayed a robust effect, reducing GFP-tau 
P301L aggregates nearly completely, whereas 
two TRIMs, TRIM26 and TRIM36, displayed a 
relatively moderate effect (Fig. 1, A to C). 

To further examine the effect of these five 
TRIMs on tau, we used two neural cell lines, 
SH-SY5Y and Neuro-2a (N2a). TRIM10, TRIM11, 
TRIM36, and TRIM55 strongly reduced GFP-tau 
P301L aggregates in both SH-SY5Y (Fig. 1, D and 
E, and fig. S2A) and N2a (fig. S2, B and C) cells 
and also reduced GFP-tau P301L aggregates in 
N2a cells insoluble in the anionic detergent
sarkosyl (fig. S2, D and E), which likely con- 
tained filamentous tau (28). By contrast, TRIM26 
showed minimal or moderate activity (Fig. 1, D 
and E, and fig. S2, A to E). With the exception 
of TRIM36 in HEK293T cells, none of these 
TRIMs reduced the levels of GFP-tau P301L 
transcript in HEK293T, SH-SY5Y, or N2a cells 
(fig. S2, F to H). 

Conversely, we knocked out each of these 
five TRIMs in HEK293T cells by means of 
CRISPR-mediated gene editing. Knockout of
TRIM10, TRIM11, or TRIM55 increased GFP-tau
P301L aggregates by ~80 to 130% without 
altering GFP-tau P301L mRNA levels, whereas 
knockout of TRIM36, as well as TRIM26, had 
no effect on GFP-tau P301L aggregates (Fig. 1, 
F and G, and fig. S3, A and B). We also knocked 
down these TRIMs in N2a cells using small 
interfering RNA (siRNA). Knockdown of TRIM10, 
TRIM11, or TRIM55 increased GFP-tau P301L 
aggregates by ~50 to 180% without affecting 
GFP-tau P301L mRNA, whereas knockdown of 
TRIM36 had no effect on GFP-tau P301L aggre- 
gates (Fig. 1, H and I, and fig. S3, C to H). For 
overexpression and knockout or knockdown, 
TRIM11 consistently displayed the strongest 
outcome. These results indicate that TRIM10,
TRIM55, and especially TRIM11 have potent
activity to clear mutant tau.

Fig. 1. Screening of TRIM proteins. (A to E) Immunoblot of HEK293T [(A) and (B)] or SH-SY5Y (D) cells cotransfected with GFP-tau P301L and control vector
(-) or the indicated TRIMs and quantification of relative GFP-tau P301L(PE)/HSP90 ratios [(C) and (E)]. WCL, whole-cell lysates. The expected full-length TRIM
bands are indicated by blue arrowheads. In (A) and (B), TRIM proteins that substantially reduced insoluble GFP-tau P301L species are labeled in red boxes.
(F to I) Immunoblot of TRIM-knockout HEK293T (F) or TRIM-knockdown N2a (H) cells transfected with GFP-tau P301L and quantification of relative GFP-tau
P301L(PE)/HSP90 ratios [(G) and (I)]. Data are shown as means ± SD (n = 3). *P < 0.05, **P < 0.01, ***P < 0.001, unpaired Student's t test.


Down-regulation of TRIM11 in AD brains

To evaluate whether TRIM10, TRIM11, or TRIM55
might be down-regulated in AD brains, we
compared their RNA and protein levels in
postmortem brain tissues from 23 sporadic AD
individuals and 14 age- and sex-matched control
individuals with no known neurodegenerative
diseases (Fig. 2A and table S2). Tau pathology
in the AD samples was verified by the reactivity
to four antibodies that recognize abnormally
phosphorylated tau (p-tau) species (Fig. 2B and
fig. S4A), as well as the formation of
high-molecular-weight tau species that were
resistant to the anionic detergent sodium dodecyl
sulfate (SDS) (fig. S4B). Each of the TRIM10,
TRIM11, and TRIM55 transcripts was present
at similar levels in AD and control tissues
(fig. S4C). Levels of TRIM10 or TRIM55
protein were also comparable in these tissues
(Fig. 2, B and C).
However, TRIM11 protein was present at a
substantially lower level in AD compared with
control tissues, with an ~55% reduction on
average (Fig. 2, B and D, and fig. S4, D and E).
This reduction was corroborated by
immunohistochemistry (IHC) (Fig. 2, E and F), as well
as by immunofluorescence (IF) controlled for
the number of neurons (Fig. 2, G and H) or total
neural cells (fig. S4, F and G). These observations,
along with the comparable abundance of
the TRIM11 transcript in AD and control brains
(fig. S4C) and the unmatched TRIM11 mRNA
and protein levels in individual AD samples
(fig. S4H), suggested that TRIM11 reduction in
AD brains preceded neuronal loss.

Fig. 2. Down-regulation of TRIM11 in sporadic AD brains. (A) Summary of
demographic data of control and AD subjects used in this study (means ± SEM).
Information on individual subjects is provided in table S2. (B to D) Immunoblot of
postmortem frontal cortex gray matter from 14 control and 23 AD individuals (B)
and relative levels of TRIM10 and TRIM55 (C) and TRIM11 (D) in AD versus
control samples. The #1 control and #1 AD samples were used in each blot for
comparison between blots. p-tau species were modified at residues S202/T205
(reactive to AT8), S262, T231 (reactive to AT180), and S396/S404 (reactive to
PHF-1). (E and F) Representative IHC images of TRIM11 and AT8-reactive p-tau in
frontal cortices [(E); scale bar, 50 μm] and quantification of TRIM11 and AT8
signals [(F); mean ± SD, n = 4]. (G and H) Representative IF images of TRIM11
and NeuN in frontal cortices of control and AD samples [(G); scale bar, 10 μm]
and quantification of TRIM11 signal normalized to numbers of neurons [(H); mean
± SD, n = 4]. Individual neurons are indicated by white arrows. (I and J) Negative
correlation between levels of TRIM11 and different p-tau species among AD and
control tissues (I) or AD tissues only (J). r and p values of the Pearson correlation
coefficient are shown. *P < 0.05; **P < 0.01; ****P < 0.0001; ns, not significant;
unpaired Student's t test.

Across control and AD brain tissues, there 
was a strong inverse correlation between TRIM11 
expression and levels of each of the four p-tau 
species (Fig. 2I). Among the AD tissues, there
was also an inverse correlation between TRIM11
expression and levels of p-tau species (Fig. 2J
and fig. S4I). These results indicate that TRIM11
is down-regulated during sporadic AD pathogenesis
with a strong association to tau pathology,
and that the change in TRIM11 expression
is probably caused by a post-transcriptional
mechanism. Single nucleotide polymorphisms
in the TRIM11 gene are linked to rare cases of
progressive supranuclear palsy, a prevalent form 
of sporadic tauopathy (29). Therefore, we focus 
on TRIM11 in the studies described below. 
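
The inverse correlations reported here are Pearson correlations. As a minimal sketch of how such a coefficient is computed, the Python snippet below implements the standard formula. The function name and the input values are hypothetical placeholders chosen for illustration; they are not data from this study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical relative band intensities (illustrative only).
trim11_level = [1.0, 0.9, 0.7, 0.5, 0.3]
ptau_level = [0.2, 0.4, 0.5, 0.9, 1.1]
r = pearson_r(trim11_level, ptau_level)
print(f"r = {r:.2f}")  # a negative r means higher TRIM11 goes with lower p-tau
```

A value of r close to -1 corresponds to a strong inverse relationship; the sign alone does not indicate which variable drives the other.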


TRIM11 promotes tau degradation 


The strong ability of TRIM11 to clear tau aggre- 
gates, as well as its down-regulation in AD 
brains, prompted us to investigate its mechanism 
of action. We first evaluated whether TRIM11 
targets tau for degradation, a possibility sug- 
gested by the screening of TRIMs (Fig. 1 and 
figs. S2 and S3). Tau is a substrate of the ubiquitin-
proteasome system (30, 31), and accumulation
of insoluble tau impairs proteasome activity, 
further exacerbating tau pathology (32). The 
commitment to proteasomal degradation occurs 
at the level of substrate recognition, yet how tau 
is specifically recognized for proteasomal degra- 
dation remains unclear. 

Upon coexpression in HEK293T cells, TRIM11 
reduced levels of GFP-tau P301L in a dose- 
dependent manner (Fig. 3A). A cycloheximide 
chase assay showed that TRIM11 accelerated 
the turnover of aggregated GFP-tau P301L and 
shortened its half-life (Fig. 3B and fig. S5A). To
corroborate this observation, we introduced 
mCherry or mCherry plus TRIM11 in QBI293 
cells that express tau P301L-GFP in a doxycycline
(Dox)-inducible manner (QBI293/tau P301L- 
GFP cells) (33). Upon induction and then termi- 
nation of tau P301L-GFP synthesis, the turnover of 
preexisting tau P301L-GFP and p-tau P301L-GFP 
was accelerated in TRIM11/mCherry-expressing 
cells compared with mCherry-expressing cells 
(Fig. 3C and fig. S5B). Conversely, we knocked 
out endogenous TRIM11 and observed a slow- 
down in GFP-tau P301L degradation (fig. S5, C 
and D). 
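
The relationship between turnover and half-life in a chase experiment can be made concrete. Assuming first-order decay once synthesis is blocked, the remaining fraction follows f(t) = exp(-k t) and the half-life is ln 2 / k, so faster turnover (larger k) means a shorter half-life. The sketch below fits k by least squares on log-transformed fractions. The function names and the time courses are hypothetical and are not taken from the figures.

```python
import math

def decay_rate(times_h, fractions_remaining):
    """Fit ln(fraction) = -k * t by least squares through the origin; return k (per hour)."""
    if len(times_h) != len(fractions_remaining):
        raise ValueError("times and fractions must have the same length")
    if any(f <= 0 for f in fractions_remaining):
        raise ValueError("fractions must be positive")
    numerator = sum(t * math.log(f) for t, f in zip(times_h, fractions_remaining))
    denominator = sum(t * t for t in times_h)
    if denominator == 0:
        raise ValueError("need at least one time point after t = 0")
    return -numerator / denominator

def half_life(k):
    """Half-life (hours) for first-order decay with rate constant k."""
    if k <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2) / k

# Hypothetical chase data: fraction of protein remaining at each time point.
times = [0, 2, 4, 8]
control = [1.00, 0.84, 0.71, 0.50]
with_trim11 = [1.00, 0.71, 0.50, 0.25]

t_half_control = half_life(decay_rate(times, control))
t_half_trim11 = half_life(decay_rate(times, with_trim11))
print(f"control: {t_half_control:.1f} h, with TRIM11: {t_half_trim11:.1f} h")
```

In this made-up example the steeper decay curve yields the smaller half-life, which is the pattern expected when a factor accelerates degradation.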

Treatment with the protein phosphatase 
inhibitor okadaic acid (OA) increased the abun- 
dance of p-tau species (34) (Fig. 3D and fig. 
S5E). TRIM11 reduced levels of OA-induced
p-tau species and nearly completely cleared them 
in the insoluble fraction (Fig. 3D and fig. S5, F 
to I). TRIM11 also decreased levels of
overexpressed wild-type tau (Fig. 3A) and
accelerated its turnover (fig. S5, J and K), whereas
knocking out TRIM11 by CRISPR-mediated
gene editing slowed turnover of overexpressed
wild-type tau (fig. S5, L and M). TRIM11-mediated
degradation of mutant and wild-type
tau was prevented by the proteasome
inhibitor MG132, but not by the lysosome
inhibitor NH4Cl (fig. S6, A and B). These results
indicate that TRIM11 targets mutant and hyper- 
phosphorylated tau, as well as superfluous 
normal tau, for proteasomal degradation. 


TRIM11 binds to and SUMOylates tau 


To evaluate whether TRIM11 interacts with tau, 
we performed a bimolecular fluorescence com- 
plementation (BiFC) assay based on the fluo- 
rescent protein Venus (35). We fused TRIM11 
and tau to the N- and C-terminal fragments of 
Venus, respectively, generating TRIM11-VN and
tau-VC (fig. S6C, left panel). When TRIM11-VN 
and tau-VC were expressed together but not 
individually, fluorescence was detected (Fig. 3E 
and fig. S6D). We also generated TRIM11-VC 
and tau-VN (fig. S6C, middle panel) and again 
observed fluorescence only when these fusions 
were expressed together (fig. S6, E and F). Thus, 
TRIM11 and tau interacted in cells, bringing the 
VN and VC moieties into close proximity to re- 
constitute Venus. 

The interaction between exogenous TRIM11 
and tau in cells was also detected by a co- 
immunoprecipitation assay (Fig. 3F and fig. 
S6G). This interaction was stronger upon treat- 
ment with OA (Fig. 3F) or when wild-type tau 
was replaced with tau P301L (fig. S6G) or tau 
AT8, a tau mutant (S199E/S202E/T205E) that
mimics the AT8-reactive p-tau (fig. S6H). In a
cell-free assay with purified recombinant
proteins (fig. S6I), tau and especially tau P301L
were pulled down by GST-TRIM11 but not GST
(Fig. 3G), indicating that TRIM11 can directly 
bind to tau. 

Endogenous TRIM11 and tau colocalized 
with each other in SH-SY5Y and N2a cells as 
shown by IF, and this colocalization was in- 
creased after OA treatment (Fig. 3, H and I, 
and fig. S7, A to C). They also interacted with 
each other in SH-SY5Y and N2a cells as shown 
by a proximity ligation assay (PLA); this inter- 
action was again increased in OA-treated cells 
(Fig. 3, J and K, and fig. S7, D to F). OA treat- 
ment noticeably elevated TRIM11 protein and 
mRNA levels (fig. S7, G to J). These results 
indicate that TRIM11 interacts with tau, pref- 
erentially mutant or hyperphosphorylated 
species. 

TRIM proteins have SUMO E3 ligase activity
(18, 21, 36). For the nucleus-localized TRIM19
(also known as PML), this activity promotes
conjugation of nuclear misfolded proteins to
poly-SUMO2/3 chains, permitting ubiquitination
of these misfolded proteins by SUMO-targeted
ubiquitin ligases and their subsequent
degradation in the proteasome (18). TRIM11, like
tau, is primarily a cytoplasmic protein (20, 21).
TRIM11 increased SUMOylation of tau in cells,
and this activity was more pronounced for tau
P301L than for wild-type tau (fig. S8A). In a
cell-free assay with purified recombinant proteins
(fig. S6I), TRIM11 directly SUMOylated tau,
especially the mutant form (Fig. 3L and fig. S8B).
By contrast, a TRIM11 mutant in which the two
conserved Glu residues proximal to the RING
domain were replaced with Ala (21) exhibited no
such activity (Fig. 3L). This mutant also failed to
promote tau degradation (Fig. 3M). These results
indicate that TRIM11 SUMOylates tau, promoting
its degradation in the proteasome.


TRIM11 enhances tau solubility 


Although defective proteins and superfluous 
normal proteins are removed by degradative 
pathways, by far most tauopathy cases are 
sporadic, and in these cases normal tau protein 
expressed at physiological levels forms fibrillar
aggregates (4). To maintain protein solubility,
cells use molecular chaperones and disaggregases,
which prevent and reverse protein aggregation,
respectively (37-39). TRIM11 preferentially
removed insoluble tau, thus increasing the frac- 
tion of soluble tau molecules (e.g., see fig. S9, A 
and B). This was the case even under condi- 
tions when the total amounts of tau protein 
were comparable in the presence or absence of 
TRIM11 (Figs. 3D and 4A and fig. S9C). TRIM11- 
mediated tau solubilization was evident when 
cells were treated with MG132 or NH4Cl (Fig. 4A),
indicating its independence from tau degrada- 
tion. Moreover, under conditions in which total 
GFP-tau P301L levels remained comparable, 
TRIM11 substantially reduced the amount of 
abnormally phosphorylated GFP-tau P301L 
species in both soluble and insoluble fractions 
(fig. S9D). 

Aggregation of tau occurs through a rate- 
limiting nucleation step that involves its self- 
association (40). To assess the effect of TRIM11 
on tau self-association in cells, we performed 
a BiFC assay in which tau was fused to VN or 
VC (41) (fig. S6C, right panel). When tau-VN
and tau-VC were expressed together but not 
individually, fluorescence signal was produced, 
indicating tau self-association (Fig. 4, B and C). 
When present at low levels that did not alter 
tau abundance, TRIM11 noticeably reduced the 
fluorescence signal (Fig. 4, B and C). Conversely, 
knockdown of TRIM11 increased the fluores- 
cence signal in cells expressing tau-VN and 
tau-VC (fig. S9, E and F). Knockout of TRIM11 
had a similar effect (fig. S9, G and H). These 
results indicate that TRIM11 enhances tau 
solubility and prevents its self-association. 


TRIM11 inhibits the seeding of tau aggregates 


Tauopathies are characterized by focal forma- 
tion of tau aggregates and their spread along 
interconnected neuronal regions (42-44). Reflect- 
ing this prion-like, self-templating propagation, 
preformed fibrils (PFFs) of tau can seed the
aggregation of intracellular soluble tau (45-48).
To determine whether TRIM11 protects against
PFF-seeded tau fibrillization, we used HEK293
cells that stably express RD(LM)-YFP, in which
the tau repeat domain (RD) harboring the
pro-aggregation mutations P301L and V337M is
fused to yellow fluorescence protein (YFP) (48).
Treatment of the HEK293/RD(LM)-YFP cells
with PFFs generated from recombinant tau
protein induced the aggregation of intracellular
RD(LM)-YFP, as shown by the appearance of
inclusions in cells (Fig. 4, D and E) and an
increase in insoluble species in cell lysates (fig. S9,
I and J). Forced expression of TRIM11 reduced
PFF-induced RD(LM)-YFP inclusions and aggregates
while having a minimal effect on levels of
soluble RD(LM)-YFP (Fig. 4, E and F, and fig. S9,
I and J).

Fig. 3. TRIM11 binds to tau and targets it for proteasomal degradation.
(A) Levels of sarkosyl-insoluble (PE) and sarkosyl-soluble (SN) tau and tau P301L [...]
comparison. CHX, cycloheximide. (C) Turnover of tau and p-tau in QBI293/tau
P301L-GFP cells stably expressing mCherry or mCherry plus TRIM11. (D) Levels
of GFP-tau and p-GFP-tau when expressed alone or together with TRIM11 in
HEK293T cells that were subsequently treated with or without OA (100 nM).
To achieve comparable levels of GFP-tau, the amount of GFP-tau plasmid was
increased when expressed together with TRIM11 (fig. S5E). (E) BiFC assay of
TRIM11-VN and tau-VC interaction in HEK293T cells. Scale bar, 100 μm.
(F) Interaction of Flag-TRIM11 and GFP-tau in HEK293T cells treated with or [...]

Fig. 4. TRIM11 is a molecular chaperone and a disaggregase for tau, [...]
QBI293/tau P301L-GFP cells stably expressing mCherry, or mCherry plus TRIM11,
and treated without (G) or with [(F) to (H)] PFFs. SDD-AGE, semi-denaturating
detergent agarose gel electrophoresis. (I to K) ThT binding (I), sedimentation (J), and
EM [(K); scale bar, 500 nm] analyses of fibril formation by tau-441 (10 μM) in the
presence of GST (1 μM) or GST-TRIM11 (0.25, 0.5, or 1 μM). (L and M) Formation of
fibrils (L) and high-molecular-weight species (M) by tau-441 P301L (7.5 μM) incubated
with GST (1 μM) or GST-TRIM11 (0.25, 0.5, or 1 μM). (N to P) ThT-binding (N),
sedimentation (O), and EM [(P), scale bar, 200 nm] analyses of tau-441 fibrils
(1 μM based on monomers) treated with GST (3.0 μM) or GST-TRIM11 (0.75, 1.5, or
3 μM). Data are shown as means ± SD. n = 3 for (C), (E), and (N); n = 8 for (G).
**P < 0.01, ***P < 0.001, unpaired Student's t test.

In an alternative approach, we treated QBI293/ 
tau P301L-GFP cells with PFFs to seed the 
aggregation of intracellular tau P301L-GFP 
(Fig. 4, F to H). When introduced into these 
cells, TRIM11 strongly decreased both tau inclu- 
sions in cells (Fig. 4, F and G) and tau aggregates 
in cell lysates (Fig. 4H). Preceding the formation 
of fibrillar aggregates, tau, similar to other 
misfolding-prone proteins linked to neuro- 
degeneration (49), assembles into soluble oligo- 
meric species, which could be neurotoxic (50). 
Forced expression of TRIM11 also reduced the 
formation of tau oligomers, as assayed with 
the tau oligomer-specific antibody T22 (51)
(Fig. 4H). Therefore, TRIM11 protects against 
PFF-seeded aggregation of intracellular tau 
into soluble oligomers and insoluble fibrils. 


TRIM11 is a molecular chaperone for tau 


The ability of TRIM11 to maintain tau solubili- 
ty even in the face of PFFs prompted us to 
investigate whether TRIM11 can function as a 
molecular chaperone, a disaggregase, or both 
for tau. In the presence of heparin, recombinant 
tau protein spontaneously produced amyloid 
fibrils (52). This was indicated by a thioflavin T 
(ThT)-binding assay (Fig. 4I); Western and
dot blots that detected pelletable SDS-soluble 
(PE) and SDS-resistant (SR) aggregates, respec- 
tively (Fig. 4J); and electron microscopy (EM) 
that directly visualized mature tau fibrils (Fig. 
4K). GST-TRIM11, but not GST, purified from
bacteria (fig. S6I) effectively prevented tau
fibrillization, reducing it by ~30% at a 1:40
molar ratio to tau and by ~60% at a 1:20 molar 
ratio (Fig. 4, I and J). EM analysis confirmed 
that TRIM11 blocked the formation of mature 
tau fibrils (Fig. 4K). Moreover, TRIM11 prevented 
the aggregation of tau P301L into amyloid fibrils 
and other high-molecular-weight species (Fig. 4, 
L and M). TRIM11 also effectively blocked
fibrillization of GST-tau (fig. S10, A to C). At a molar
ratio of 1:20, TRIM11 almost completely blocked 
the formation of GST-tau fibrils (fig. S10A). 
These observations indicate that TRIM11 is a 
potent molecular chaperone for tau, preventing
its misfolding and aggregation. Distinct
from canonical chaperones such as the HSP70 
and HSP90 systems, which are multicomponent 
machineries driven by energy derived from ATP 
hydrolysis (15), TRIM11 obviates tau aggregation 
on its own, in the absence of any other protein 
components or ATP (Fig. 4, I to M, and fig. S10, 
A to C).
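
Molar ratios of chaperone to substrate follow directly from the protein concentrations, and the extent of inhibition is a percent reduction relative to the control signal. The sketch below works through that arithmetic, assuming the concentrations listed in the Fig. 4 legend (tau-441 at 10 μM; GST-TRIM11 at 0.25, 0.5, or 1 μM) apply to these assays. The function names and the signal values passed to `percent_reduction` are hypothetical.

```python
from fractions import Fraction

def molar_ratio(chaperone_uM, substrate_uM):
    """Return the chaperone:substrate molar ratio as a reduced 'a:b' string."""
    # Converting through str keeps decimal inputs such as 0.25 exact.
    ratio = Fraction(str(chaperone_uM)) / Fraction(str(substrate_uM))
    return f"{ratio.numerator}:{ratio.denominator}"

def percent_reduction(control_signal, treated_signal):
    """Percent decrease of a readout (e.g., ThT fluorescence) relative to control."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * (control_signal - treated_signal) / control_signal

tau_uM = 10  # tau-441 concentration from the Fig. 4 legend
for trim11_uM in (0.25, 0.5, 1):  # GST-TRIM11 concentrations from the legend
    print(trim11_uM, "uM ->", molar_ratio(trim11_uM, tau_uM))

# Hypothetical ThT readings in arbitrary units, for illustration only.
print(percent_reduction(100.0, 40.0))
```

Expressing the ratio as a reduced fraction makes it easy to see how far below equimolar the chaperone is in each condition.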


TRIM11 is a disaggregase for tau 


Amyloid fibrils, which are composed of β
strands that stack perpendicularly to the fibril
axis (cross-β structure), are highly organized,
energetically stable structures (53, 54). Their 
dissolution usually requires the coordinated 
action of multiple factors in an ATP-dependent 
manner (39, 55). Bacterial, fungal, and plant 
cells contain a disaggregation system com- 
posed of HSP70, its co-chaperone HSP40, and 
the AAA+ ATPase HSP104 (56), but this system
is absent in metazoans. Recent studies showed 
that mammalian HSP70/HSP40 and HSP110 
(an atypical HSP70 family) can work together 
as a disaggregase (57-59). However, this dis- 
aggregation system fragments amyloid fibrils, 
generating small seeds that promote, rather 
than suppress, the aggregation of tau and other 
misfolding-prone proteins, thus exacerbating 
their neurotoxicity (60, 61). 

When incubated with tau fibrils, TRIM11 was 
capable of dissolving these aggregates, reducing 
their binding to ThT (Fig. 4N) and converting 
most of them to a soluble state (Fig. 4O). This
activity was verified by EM analysis (Fig. 4P). 
Similarly, TRIM11 could dissolve GST-tau fibrils 
(fig. S10, D and E). Thus, TRIM11 is also a 
disaggregase for tau. As for its molecular 
chaperone activity, TRIM11 dissolved preexisting
tau fibrils on its own. Unlike the mammalian
HSP70/HSP40-HSP110 system, TRIM11
effectively abrogates the formation and seeding of 
tau aggregates in cells (Fig. 4, A to H, and fig. S9). 


TRIM11 protects primary neurons against 
tau aggregation 


In primary neurons derived from wild-type 
mice, as in neural cell lines, endogenous TRIM11 
and tau colocalized with each other (Fig. 5A and 
fig. S11, A and B) and also interacted with each 
other, as shown by PLA (fig. S11, C and D). Treat- 
ment with OA, which elevated TRIM11 levels 
(fig. S11, E and F), augmented the TRIM11-tau 
interaction (fig. S11, C and D). Moreover, TRIM11 
interacted with AT8-reactive p-tau in human 
brain tissues, and this interaction was much 
more pronounced in AD compared with control 
tissues (fig. S11, G and H). 

To assess the effect of TRIM11 down- 
regulation on tau aggregation, we generated 
several antisense oligonucleotides (ASOs) composed
of 2'-deoxy-2'-fluoro-D-arabinonucleic acid
(FANA) (62) against Trim11. These FANA-ASOs
effectively entered cultured neurons and
silenced TRIM11 expression to different extents
(fig. S12, A and B). We treated cortical neurons 
derived from PS19 transgenic mice that ex- 
press human tau P301S (63) with PFFs formed 
by myc-K18/P301L, a truncated tau protein 
consisting of the microtubule-binding domain 
with the P301L mutation (47). This led to the 
formation of neuritic thread-like inclusions 
reactive to AT8 and MC1 (Fig. 5, B and C, and
fig. S12, C and D), the latter recognizing a
disease-specific conformation of tau. Silencing
Trim11 by FANA-ASO exacerbated tau aggregation,
increasing AT8- and MC1-reactive tau
by ~50 to 90% (Fig. 5, B and C, and fig. S12,
C and D).

To evaluate the effect of TRIM11 up-regulation, 
we cloned TRIM11 and, as a control, GFP into the
adeno-associated virus AAV9 vector, a serotype
that effectively targets cells in the central nervous
system (64) (fig. S12E). When treated with
myc-K18/P301L PFFs, AAV9-TRIM11-transduced
PS19 cortical neurons exhibited an ~60% reduction
in AT8- and MC1-reactive tau compared
with AAV9-GFP-transduced neurons (Fig. 5,
D and E, and fig. S12, F and G). Likewise, when
hippocampal neurons derived from PS19 mice
were treated with myc-K18/P301L PFFs, substantially
less AT8- and MC1-reactive tau appeared in
AAV9-TRIM11-transduced neurons than in
AAV9-GFP-transduced neurons (fig. S12, H and I). These
results indicate that TRIM11 affords neurons 
protection against tau aggregation. 


TRIM11 maintains neuronal viability 
and connectivity 


Synaptic degeneration is a prominent feature 
of AD patients and mouse models, preceding 
neuronal loss and correlating strongly with 
cognitive decline (65). To evaluate the role of 
TRIM11 in synapse formation, we probed neu- 
rons with antibodies to the presynaptic marker 
synaptophysin (SYP) and postsynaptic density 
protein 95 (PSD95). Silencing TRIM11 in wild- 
type cortical neurons by ASOs led to a reduc- 
tion in SYP-positive puncta (by ~40%; Fig. 5, 
F and G) and PSD95-positive puncta (by ~30%; 
Fig. 5, H and I), as well as their juxtaposition
(fig. S12, J and K). TRIM11-depleted neurons
also contained lower levels of neurofilament 
light chain (NFL) (~40% reduction), a compo- 
nent of the axonal scaffold, but not microtubule- 
associated protein 2 (MAP2), which stabilizes 
microtubules in dendrites (Fig. 5, J to L, and 
fig. S12L). Moreover, silencing TRIM11 with 
various ASOs reduced the viability of cortical 
neurons in a manner that was correlated with 
the effect of these ASOs on TRIM11 expression
(Fig. 5M and fig. S12B). These results suggest 
that endogenous TRIM11 is a neuroprotective 
factor, and its down-regulation impairs neuro- 
nal connectivity and viability. 

Conversely, forced TRIM11 expression through 
AAV9-mediated transduction increased SYP- and 
PSD95-positive puncta by ~70% and ~50%, 
respectively (Fig. 5, N to Q). TRIM11 also
elevated the expression of NFL (by ~50%) but
not that of MAP2 (Fig. 5, R to T). Moreover,
TRIM11 conferred to neurons resistance to tau
PFF-induced cytotoxicity (Fig. 5U). Therefore,
TRIM11 up-regulation enhances neuronal integrity
and synaptic formation.

Fig. 5. TRIM11 maintains neural integrity and connectivity. (A) Colocalization
of endogenous TRIM11 and tau in wild-type cortical neurons. Scale bar, 10 μm.
(B to E) Images [(B) and (D)] and quantification [(C) and (E)] of intracellular
AT8-reactive tau aggregates in PS19 cortical neurons transduced with control or
TRIM11 ASO [(B) and (C)] or with AAV9-GFP or AAV9-TRIM11 [(D) and (E)] and
treated with myc-K18/P301L PFFs. Scale bar, 10 μm in (B) and 50 μm in (D).
(F to I) Images [(F) and (H); scale bar, 10 μm] and quantifications [(G) and (I)]
of SYP- or PSD95-reactive puncta in wild-type cortical neurons treated with control or
TRIM11 ASO. (J to L) Images of NFL and MAP2 staining [(J); scale bar, 10 μm]
and relative intensity of NFL staining (K) or length of MAP2-stained dendrites (L)
normalized to neuronal cell number in wild-type cortical neurons treated with
control or TRIM11 ASO. (M) Viability of wild-type cortical neurons treated with
control ASO or the indicated TRIM11 ASOs. (N to Q) Images [(N) and (P); scale bar,
10 μm] and quantifications [(O) and (Q)] of SYP- or PSD95-reactive puncta in
wild-type cortical neurons transduced with AAV9-GFP or AAV9-TRIM11. (R to
T) Images of NFL, MAP2, and GFP/TRIM11 staining [(R); scale bar, 10 μm] and
quantifications of relative NFL and MAP2 intensity normalized to neuronal cell
number [(S) and (T); mean ± SEM] in wild-type cortical neurons transduced
with AAV9-GFP or AAV9-TRIM11. (U) Viability of wild-type cortical neurons
transduced with AAV9-GFP or AAV9-TRIM11 and treated with or without tau PFFs.
Data are shown as means ± SEM; n = 6 for (M) and n = 3 for all other panels.
*P < 0.05; **P < 0.01; ***P < 0.001; ns, not significant; unpaired Student's t test.

Zhang et al., Science 381, eadd6696 (2023) 28 July 2023


TRIM11 ameliorates tau pathology, 
neuroinflammation, and cognitive impairments 
in PS19 mice 


Next, we evaluated the protective effect of 
TRIM11 in mouse models of tauopathy. We
first used PS19 transgenic mice, a widely used 
tauopathy model that progressively accumu- 
lates tau inclusions, resembling human AD and 
other tauopathy patients (63). AAV9-TRIM11 or 
AAV9-GFP vector was delivered to the hippo- 
campus of these mice through bilateral stereo- 
taxic injection at 2.5 months of age, and brain 
pathology and animal behaviors were analyzed 
at ~10 months of age (Fig. 6A). As expected, 
hippocampi of AAV9-GFP-injected mice dis- 
played strong NFT-like tau inclusions (Fig. 6B 
and fig. S13A). By comparison, hippocampi of 
AAV9-TRIM11-injected mice contained substan- 
tially less tau pathology (~55% reduction) (Fig. 
6B and fig. S13A). This reduction was corrobo- 
rated by immunoblot analysis, which showed 
that TRIM11 strongly reduced both insoluble 
and soluble p-tau species (Fig. 6C and fig. S13,
B to D). The inhibitory effect of TRIM11 was 
dose dependent, because hippocampi with higher 
TRIM11 expression contained fewer p-tau species
(Fig. 6C). 

Astrogliosis parallels the distribution and 
density of NFTs in tauopathies (66, 67), and is 
an early pathological manifestation of PS19 mice 
(63). AAV9-GFP-injected brains displayed strong
staining of glial fibrillary acidic protein (GFAP) 
(Fig. 6, D and E, and fig. S13E), indicative of 
astrocyte activation. By comparison, AAV9- 
TRIM11-injected brains exhibited an ~50% 
reduction in GFAP immunoreactivity, to a level 
that was observed in brains of age-matched 
wild-type littermates (Fig. 6, D and E, and 
fig. S13E). Activation of microglia was also de- 
tected in AAV9-GFP-injected brains, as shown 
by increased immunoreactivity to ionized 
calcium-binding adapter molecule 1 (Iba1) 
(Fig. 6, F and G, and fig. S13F). By contrast, 
AAV9-TRIM11-injected brains showed an 
~40% reduction of microgliosis, again to a level 
seen in wild-type brains (Fig. 6, F and G, and 
fig. S13F). 

Similar to human tauopathies, tau dysfunc- 
tion in PS19 mice is associated with neuronal 
loss, especially in hippocampi (63). Compared 
with the hippocampi of the GFP group of animals, 
hippocampi of the TRIM11 group of animals 
exhibited an ~50% increase in immunoreactivity 
for MAP2 (Fig. 6, H and I, and fig. S13G) and an 
~70% increase in immunoreactivity for NFL, to a 
level comparable to that in hippocampi of the 
wild-type group of animals (Fig. 6, J and K, and
fig. S13H). Moreover, hippocampi of AAV9- 
TRIM11-treated animals displayed an ~40% 
increase in the expression of NeuN (Fig. 6, 
L and M, and fig. S13I). These results indicate
that TRIM11 obviates axonal and dendritic de- 
generation and neuronal loss in PS19 mice. 

AD and many other tauopathies are charac- 
terized by a progressive decline in cognitive 
function (4, 5). To evaluate learning and memory 
ability, we performed an object recognition 
test (ORT) in which the propensity of mice to 
examine a novel object was observed (68). 
The GFP group of mice showed no difference 
in the times interacting with familiar and novel 
objects (i.e., ~50% preference to each). By 
contrast, the TRIM11 group of mice exhibited 
an ~77% preference to the novel subject, simi- 
lar to wild-type mice (Fig. 6N), suggesting that 
TRIM11 averts the deterioration of long-term 
memory. Decline in motor strength is another 
tau-dependent defect in PS19 mice (69). In a 
wire hang test, the TRIM11 group of mice ex- 
hibited a substantially longer latency to fall 
(~75 s) compared with the GFP group of mice 
(~50 s), albeit they were still weaker than 
wild-type mice (Fig. 6O). These results indicate
that intracranial delivery of AAV9-TRIM11 
in PS19 mice prevents tau pathology, neuro- 
degeneration, and gliosis and improves cogni- 
tive and motor functions. 
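
The ORT preference values quoted above (~50% versus ~77%) can be read as the share of total exploration time spent on the novel object. The following is a minimal sketch of that calculation, assuming this common definition; the function name and the exploration times are hypothetical and are not taken from the study.

```python
def novel_object_preference(t_novel: float, t_familiar: float) -> float:
    """Return the percentage of total exploration time spent on the novel object."""
    total = t_novel + t_familiar
    if total <= 0:
        raise ValueError("total exploration time must be positive")
    return 100.0 * t_novel / total


# Hypothetical exploration times in seconds, chosen only for illustration.
print(novel_object_preference(20.0, 20.0))  # 50.0 (no preference)
print(novel_object_preference(38.5, 11.5))  # 77.0 (preference for the novel object)
```

A value near 50% indicates that the animal does not discriminate between the two objects, which is how the GFP group is described in the text.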


TRIM11 prevents PFF-accelerated disease 
phenotypes in PS19 mice 


Inoculation of tau PFFs into the brains of PS19 
mice before the onset of tauopathy induces 
focal formation and widespread transmission 
of tau aggregates, accelerating neuroinflam- 
mation and cognitive and motor deterioration 
(47). To evaluate whether TRIM11 protects 
against PFF-accelerated disease phenotypes, 
we inoculated PFFs generated from recombi- 
nant myc-K18/P301L (47), together with either 
AAV9-TRIM11 or AAV9-GFP, unilaterally into 
the hippocampus of PS19 mice at 8 weeks of 
age and analyzed disease phenotypes at 12 weeks 
of age (Fig. 7A). When co-inoculated with AAV9- 
GFP, myc-K18/P301L PFFs induced a substantial 
number of tau aggregates in the ipsilateral 
hippocampus (Fig. 7B and fig. S14A). By com- 
parison, when co-inoculated with AAV9-TRIM11, 
myc-K18/P301L induced ~45% fewer tau aggre- 
gates (Fig. 7B and fig. S14A). Levels of abnormally 
phosphorylated tau were also noticeably lower
in both soluble and insoluble fractions in AAV9- 
TRIM11-coinjected hippocampi (Fig. 7C and 
fig. S14, B to D). In the presence of AAV9-GFP, 
myc-K18/P301L PFFs elicited astrogliosis and 
microgliosis. In the presence of AAV9-TRIM11, 
however, the numbers of reactive astrocytes and 
microglia strongly decreased (Fig. 7, D and E, 
and fig. S14, E and F). 

In an ORT, the TRIM11 group of animals 
showed a preference to the novel subject that
was comparable to that of wild-type animals,
whereas the GFP group of animals showed no
such preference (Fig. 7F). Therefore, TRIM11
obviates the decline in long-term memory. To 
assess the effect of TRIM11 on short-term memory,
we measured spontaneous alternation behavior
in the Y-maze (70). The TRIM11 group
exhibited significantly more spontaneous alter- 
nations (~56%) than the GFP group (~35%), 
reaching a level shown by wild-type mice (Fig. 
7G). Thus, TRIM11 also prevents the decline in 
the hippocampus-dependent working memory. 
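
Spontaneous alternation in the Y-maze is commonly scored as the fraction of overlapping triplets of consecutive arm entries that cover all three arms. The sketch below assumes that common definition (the text does not spell out the scoring rule), and the arm-entry sequences are hypothetical.

```python
def spontaneous_alternation_percent(entries: str) -> float:
    """Percentage of overlapping entry triplets that visit three distinct arms.

    `entries` is the ordered sequence of arm visits, e.g. "ABCAB" for arms A, B, C.
    """
    if len(entries) < 3:
        raise ValueError("need at least three arm entries")
    # One triplet starts at every entry except the last two: (n - 2) triplets.
    triplets = [entries[i:i + 3] for i in range(len(entries) - 2)]
    alternations = sum(1 for triplet in triplets if len(set(triplet)) == 3)
    return 100.0 * alternations / len(triplets)


# Hypothetical entry sequences, for illustration only.
print(spontaneous_alternation_percent("ABCABCAB"))  # 100.0
print(spontaneous_alternation_percent("ABABCACB"))  # 50.0
```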

To evaluate motor function and anxiety-like 
phenotypes, we performed an open field test. 
Compared with the GFP group of mice, the 
TRIM11 group of mice traveled a longer distance
(~36% increase) (Fig. 7, H and I) and
stopped for a shorter period of time (~40% 
reduction) (Fig. 7J). These behaviors of TRIM11 
mice were again similar to those of wild-type 
mice (Fig. 7, H to J). These results indicate that 
TRIM11 inhibits the seeding and cell-to-cell 
transmission of tau aggregates in PS19 mice, 
preventing the activation of glial cells and 
the decline in cognitive and motor abilities. 


TRIM11 ameliorates tau pathology and 
cognitive defects in 3xTg-AD mice 


In addition to tau-containing intraneuronal 
NFTs, AD is characterized by extracellular 
plaques composed of Aβ peptide (4, 5). To
evaluate the protective effect of TRIM11 on
the combined pathological effects of tau and
Aβ, we used a triple transgenic AD model
(3xTg-AD), which expresses tau P301L as
well as familial mutations in two AD-related
proteins, the amyloid precursor protein (APP)
K595N/M596L (the Swedish mutation) and
the presenilin 1 (PS1) M146V (71). 3xTg-AD
mice develop both Aβ plaques and tau
tangles, resembling human AD. We delivered 
AAV9-TRIM11 or AAV9-GFP to the hippocam- 
pus of 3xTg-AD mice at 12 months of age, 
when they already showed a substantial amount 
of tau pathology in this brain region, and per- 
formed pathology and behavior analysis 1 month 
later (Fig. 8A). Delivery of AAV9-TRIM11 led to a 
substantial reduction in tau pathology com- 
pared with that of AAV9-GFP, as shown by IHC 
analysis (~30% reduction) (Fig. 8B and fig. S15A).
This reduction was corroborated by Western 
blot analysis, which indicated an ~80 to 90% 
reduction in p-tau species reactive to AT8 and 
PHF-1 (Fig. 8C, and fig. S15, B to D). Levels of 
AT8 staining in AAV9-TRIM11-injected brains
at 13 months of age were lower than those in
uninjected brains at 12 months of age (fig. S15A),
suggesting that TRIM11 cleared preexisting tau
aggregates. GFAP and Iba1 immunoreactivity
was also markedly lower in the hippocampus 
of AAV9-TRIM11-injected animals compared 
with that in AAV9-GFP-injected animals, being 
reduced by ~30% and 60%, respectively (Fig. 8, D 
to G, and fig. S15, E and F).

In 3xTg-AD mice, cognitive impairment
initially manifests at 4 months of age and
progresses as tau tangles and Aβ plaques
accumulate (77). Compared with AAV9-GFP- 
injected animals, AAV9-TRIM11-injected animals 
spent significantly more time exploring the 
novel object and less time on the familiar object
in the ORT (Fig. 8H). They also spent more
time exploring different arms in the Y-maze test
(Fig. 8I). Moreover, the TRIM11 group of animals
exhibited higher locomotive activity in
an open-field test, traveling longer distances 
and stopping for less time (Fig. 8, J and K). 
Therefore, TRIM11 suppresses tau pathology 
and neuroinflammation and improves cognitive
and motor ability of 3xTg-AD mice.

Fig. 7. TRIM11 ameliorates PFF-accelerated tau pathology and cognitive and behavioral impairments in PS19 mice. (A) Schematic representation of the
study. (B, D, and E) IHC staining with AT8 (B), anti-GFAP (D), or anti-Iba1 (E) antibody of the hippocampus (left) and quantification of AT8 (B), GFAP (D), or Iba1
(E) immunoreactivity (right; mean ± SD, n = 5 to 6 mice). (C) Immunoblot of tau and p-tau species in the hippocampus. Each lane represents a single mouse.
(F) Preference for the novel subject in ORT (mean ± SD, n = 7 or 9 mice). (G to J) Alternation in Y-maze (G) and travel distance [(H) and (I)] and freezing time (J)
in the open-field maze. Data are shown as means ± SD (n = 9 or 10 mice). *P < 0.05; **P < 0.01; ***P < 0.005; ns, not significant; unpaired Student's t test.


Effect of cerebrospinal fluid delivery of TRIM11


The animal experiments described above not
only demonstrated the role of TRIM11 in protecting
against tau-dependent neurodegeneration
in mammalian brains, but also suggested
a potential utility of the TRIM11 gene in disease
treatment. However, the intraparenchymal
infusion used in these experiments was spatially
restricted, whereas AD and many other tauopathies
affect multiple brain regions. Thus,
we investigated the effect of administering 
AAV vectors globally in the brain through the 
cerebrospinal fluid (CSF). We delivered AAV9- 
TRIM11 and AAV9-GFP by intracerebroventric- 
ular injection to 3xTg-AD mice at 9 months 
of age and analyzed mouse behaviors and patho- 
logy 1 to 4 months later (Fig. 8L). Compared 
with the GFP group of mice, the TRIM11 group 
of mice showed an ~35% reduction in tau- 
containing, NFT-like inclusions in the hippo- 
campus (Fig. 8M) and an ~50 to 70% reduction 
in soluble and insoluble p-tau species in cell
lysates (Fig. 8N and fig. S16, A to C). The
TRIM11 group of mice also showed a substantial
reduction in GFAP and Iba1 immunoreactivity
to levels that were close to, or only
moderately higher than, those in the wild-type 
mice (Fig. 8, O and P, and fig. S16, D and E). 
Moreover, compared with the GFP group of 
animals, the TRIM11 group of animals performed 
significantly better in the ORT and Y-maze tests 
(Fig. 8, Q and R) and showed higher locomotive 
ability in the open-field test (Fig. 8S). These 
results suggest that CSF administration of the 
AAV9-TRIM11 vector might be beneficial for
treating tauopathies.

Fig. 8. TRIM11 ameliorates tau pathology and cognitive deficits of 3xTg-AD
mice. (A and L) Schematic representation of bilateral IP (A) and unilateral ICV
(L) injection of 3xTg-AD mice. (B and M) AT8 staining of hippocampi from
mice injected with the indicated AAV vector through IP (B) or ICV (M) (left; scale
bar, 0.2 mm) and quantification of AT8 signal (right; means ± SD, n = 6 mice).
(C and N) Immunoblot of tau and p-tau species in the hippocampus of mice
injected with indicated AAV vectors through IP (C) or ICV (N). Each lane
represents a single mouse. (D to G, O, and P) Representative IHC images [(D)
and (E); scale bar, 0.2 mm] and quantification of GFAP or Iba1 immunoreactivity


Outlook


Our results show that TRIM11 plays a critical
role in maintaining tau in its functional soluble
state. TRIM11 achieves this important outcome
through multiple mechanisms. It promotes
the proteasomal degradation of mutant and
hyperphosphorylated tau as well as excess normal
tau, thereby acting as a crucial link between
tau and the proteasome. TRIM11 also
serves as a molecular chaperone, preventing
tau misfolding and aggregation. Moreover,
TRIM11 functions as a disaggregase to dissolve
preexisting tau deposits, including tau amyloid
fibrils that are often intractable. Canonical PQC
systems such as HSP60, HSP70, and HSP90,
which are conserved in both prokaryotes and 
eukaryotes, rely on coordinated actions of 
multiple components in an ATP-dependent 
manner. By contrast, TRIM11 can prevent or 
reverse tau aggregation by itself independently 
of ATP. These multiple and potent activities of 
TRIM11 in combination are highly effective in 
protecting against tau misfolding and aggrega- 
tion in various cell and animal models of tau- 
opathy. A survey of virtually all known human 
TRIMs shows that these activities of TRIM11 
are likely shared by at least some other mem- 
bers of this large family. TRIM proteins, which 
are only identified in metazoans, might have 
evolved later during evolution in part to main- 
tain the quality of the complex proteomes that 
enable these intricate life forms. 

The down-regulation of TRIM11 in sporadic
AD brains, combined with its potent protective
effect on tau, suggests that the diminished
capacity of TRIM11 might contribute to tau 
aggregation and the associated pathological 
and cognitive changes. AD and other neuro- 
degenerative tauopathies are becoming increas- 
ingly prevalent as the population ages, yet they 
remain incurable. Tau, the abnormality of which 
is both more closely linked to disease progression
than that of Aβ and is required for the
cytotoxicity of Aβ, represents a key target for
AD in addition to primary tauopathies (72). In 
recent years, gene therapy has become an 
important approach to treating neurological 
diseases (73). AAVs can transduce the nondividing 
neurons and permit permanent expression of 
the therapeutic gene after a single administra- 
tion (73, 74), with positive clinical outcomes 
(75). The effect of intraparenchymal and intra- 
ventricular delivery of AAV9-TRIM11 in animal 
models provides a proof of concept for the 
potential utility of the TRIM11 gene to restore
protein homeostasis in AD and other tauopa- 
thies, thus addressing the root cause of these 
devastating diseases. 


Materials and Methods 
Antibodies 


Antibodies against the following proteins or 
epitopes were purchased from the indicated 
sources: TRIM10 (pAb, Abcam, catalog no. 
ab151306), TRIM11 (pAb, Millipore, catalog 
no. ABC926; pAb, Proteintech, catalog no. 10851; 
pAb, Abcam, catalog no. ab111694), TRIM26 
(pAb, Santa Cruz, catalog no. sc-393832), TRIM36 
(pAb, Millipore, catalog no. SAB2106623), TRIM55 
(MURF2) (mAb, Abnova, catalog no. H00084675-M02),
GFP (mAb, Santa Cruz, catalog no. sc-9996;
pAb, GeneTex, catalog no. GTX113617), hemagglu- 
tinin (HA) (mAb, C29F4, Cell Signaling, catalog no. 
3724), GAPDH (mAb, Santa Cruz, catalog no. sc-32233),
β-actin (mAb, Sigma-Aldrich, catalog no.
A5441), FLAG (mAb, Cell Signaling, catalog no.
14793; mAb, M2, Sigma-Aldrich, catalog no. F1804), 
Hsp90 (Cell Signaling, catalog no. 4874), phospho-tau
(Ser202/Thr205) (mAb AT8, Thermo Fisher
Scientific, catalog no. MN1020), phospho-tau 
(Thr231) (mAb AT180, Thermo Fisher Scientific, 
catalog no. MN1040), phospho-tau (Ser262) 
(pAb, Invitrogen, catalog no. OPA1-03142), 
phospho-tau (Ser396) (pAb, Thermo Fisher 
Scientific, catalog no. 44-752G), tau (mAb tau5, 
Thermo Fisher Scientific, catalog no. AHB0042), 
tau (pAb T22, Millipore, catalog no. ABN454), p62 
(SQSTM1) (pAb, MBL, catalog no. PM045), 6xHis 
(pAb, Cell Signaling, catalog no. 2365), mCherry 
(mAb, Santa Cruz, catalog no. sc-390909), 
c-Myc (pAb, Santa Cruz, catalog no. sc-40), LC3B 
(D11, XP, mAb, Cell Signaling, catalog no. 3868), 
SUMO2/3 (pAb, Abcepta, catalog no. AP1224a),
FLAG M2 agarose beads (Sigma-Aldrich, cata- 
log no. A1205), PSD95 (mAb 7E3, Cell Signaling, 
catalog no. 36233; mAb, Cell Signaling, catalog 
no. 3409), synaptophysin (mAb, Cell Signaling, 
catalog no. 36406S), NeuN (mAb, Millipore, 
catalog no. MAB377), and MAP2 (pAb, Origene 
Technologies, catalog no. TA309162). Anti- 
phospho-tau mAbs PHF-1 (Ser396/Ser404) and 
MC1 were kindly provided by Dr. Peter Davies.


Reagents 


The following reagents were purchased from 
Sigma-Aldrich (St. Louis, MO): Mg²⁺-ATP
(catalog no. A9187), FLAG peptide (catalog
no. F3290), NH₄Cl (catalog no. 09718), Benzonase
(catalog no. E1014), Isopropyl-1-thio-D-galactopyr- 
anoside (IPTG) (catalog no. 16758), Phenyl- 
methylsulfonyl fluoride (PMSF) (catalog no. 
P7626), Complete Protease Inhibitor Cocktail 
(catalog no. 11697498001), L-glutathione reduced 
(catalog no. G4251), cycloheximide (catalog no. 
66819), thioflavin T (catalog no. T3516), imidazole 
(catalog no. 15513), heparin (catalog no. H3393), 
Duolink In Situ Red starter Kit (catalog no. 
DUO92101), and Polybrene (catalog no. TR-1003). 
The following reagents were purchased from 
Thermo Fisher Scientific (Waltham, MA): Gluta- 
thione Superflow Agarose (catalog no. 25236), 
High-Capacity cDNA Reverse Transcription Kit 
(catalog no. 468814), puromycin dihydrochloride
(catalog no. A1113803), Lipofectamine 2000 (catalog
no. 6031), Lipofectamine RNAiMAX (catalog
no. 13778150), TRIzol Reagent (catalog no. 15596),
Hoechst 33342 (catalog no. H3570), SYBR Green
Master Mix (catalog no. A25742), and SuperSignal 
Western Blot Substrate Bundle (catalog no. 
A45916). SUMO E1 (catalog no. E-315), SUMO 
E2 (UbcH9) (catalog no. E2-465), and 6xHis- 
SUMO2 (catalog no. UL-75) were purchased 
from Boston Biochem (Cambridge, MA); and 
4’,6-diamidino-2-phenylindole (DAPI) (catalog 
no. H-1200) and ABC (Avidin-Biotin Complex) 
Kits (catalog no. PK6100) from Vector Labora- 
tories (Newark, CA). Ni-NTA Agarose (catalog 
no. 30230) was purchased from Qiagen (Hilden, 
Germany), MG132 (catalog no. S2619) from
Selleck Chemicals (Houston, TX), polyethyleni- 
mine (PEI) (Linear, MW 25000, catalog no. 
23966-1) from Polysciences (Warrington, PA),
phosphatase inhibitor PhosSTOP (catalog no.
04906845001) from Roche (Basel, Switzerland), 
DPX mounting medium (catalog no. 13510) 
from Electron Microscopy Science (Hatfield, 
PA), Bradford protein assay kit (catalog no. 
5000205) from Bio-Rad Labs (Hercules, CA), 
Tau-441 P301L (catalog no. T-1014-1) from 
rPeptide (Watkinsville, GA), and NeuN ready- 
to-use IHC Kit from Proteintech (catalog no. 
KHC0003). 


Plasmids 


For expression in mammalian cells, cDNAs for 
75 TRIMs, including 73 human TRIMs and 
mouse TRIM12 and TRIM30 (table S1), were 
synthesized and cloned into pCDH-EF1-FHC 
(Addgene, catalog no. 64874), in which each 
TRIM was fused with an FLAG tag and an 
HA tag at the C terminus (Gene Universal, 
Newark, DE). Tau-VN173 and TRIM11-VN173 
were cloned in pBiFC-VN173 (Addgene plasmid
22010), and tau-VC155 and TRIM11-VC155
were cloned in pBiFC-VC155 (Addgene plasmid
22011) (gifts from C.-D. Hu). Tau-GFP, tau
P301L-GFP, and tau AT8 (S199E, S202E, T205E)-GFP
were made in pEGFP-N1, in which EGFP is
fused to the C terminus of tau proteins. pRK5-EGFP-tau
and pRK5-EGFP-tau P301L, in which
EGFP is fused to the N terminus of tau proteins,
were gifts of K. Ashe (Addgene plasmids 46904
and 46908, respectively). Flag-TRIM11, Flag- 
TRIM117™ (both in pcDNA3.1) (21), mCherry, 
and mCherry-TRIM11 (both in pTRPE) (79) were 
previously described. 

For expression in bacteria, GST-tau and GST- 
tau P301L were made in pGEX-1ZT, a derivative 
of pGEX-1λT with additional cloning sites.
6xHis-GFP-tau and 6xHis-GFP-tau P301L were 
cloned in pET-28(+). GST-TRIM11 was cloned in 
pGEX-1ZT (21). 


siRNAs, single-guide RNAs, and antisense oligos 


siRNAs targeting mouse TRIM10 (catalog no. sc-76733),
mouse TRIM11 (catalog no. sc-76735),
mouse TRIM36 (catalog no. sc-154647), mouse
TRIM55 (catalog no. sc-149718), and human
TRIM11 (catalog no. sc-76734) were purchased
from Santa Cruz Biotechnology. Negative con- 
trol siRNA: 5'-GGUUAAUCGCGUAUAAUACG- 
CGUAU-3' was made by IDT. 

Single-guide RNAs (sgRNAs) targeting hu- 
man TRIM10, TRIM11, TRIM26, TRIM36 and 
TRIM55 were synthesized by Integrated DNA 
Technologies (Coralville, IA, USA). Targeting 
sequences are: TRIM10, 5'-GGCAGTTGACTTCATCTGCC-3';
TRIM11, 5'-GAGCCAGCGGCAGAACGTGC-3';
TRIM26, 5'-GCCGCTCAATGTTCTCCACC-3';
TRIM36, 5'-TACCATTAAGAATATCGAAA-3';
TRIM55, 5'-AACCCGTATTTGCCCACAAG-3'.

ASOs used in this study were designed and 
synthesized by AUM LifeTech (Philadelphia, 
PA, USA). ASO sequences are: TRIM11-1,
5'-ATAAACAGCAGCGACCCATCC-3'; TRIM11-2,
5'-ACTTAGTGCTTTGGTGAGAGC-3'; TRIM11-3,
5'-ACTGTAGAATGAGAGATGGCC-3'; TRIM11-4,
5'-TAGAATGAGAGATGGCCAGCT-3'; TRIM11-5,
5'-ATTTGTTTCCGTAGGTGCTCC-3'; scrambled
control (SCR CTRL), 5'-CCTTCCCTGAAGGTTCCTCC-3'.
SCR CTRL-Far Red was tagged with
a far red fluorescent dye with an excitation 
maximum at 646 nm and emission maximum 
at 669 nm (+5). 
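
Because oligonucleotide sequences are easily corrupted in transcription, a quick sanity check of length, alphabet, and GC content can be useful before ordering or re-using them. The sketch below applies such a check to the five sgRNA targeting sequences listed above, transcribed with the line-break hyphens and spaces removed. The helper names and the expected length of 20 nucleotides are assumptions made for illustration, not part of the reported methods.

```python
# sgRNA targeting sequences transcribed from the list above (5' to 3').
SGRNA_TARGETS = {
    "TRIM10": "GGCAGTTGACTTCATCTGCC",
    "TRIM11": "GAGCCAGCGGCAGAACGTGC",
    "TRIM26": "GCCGCTCAATGTTCTCCACC",
    "TRIM36": "TACCATTAAGAATATCGAAA",
    "TRIM55": "AACCCGTATTTGCCCACAAG",
}


def gc_percent(seq: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def is_valid_oligo(seq: str, expected_length: int = 20) -> bool:
    """Check that a sequence has the expected length and only A, C, G, T."""
    seq = seq.upper()
    return len(seq) == expected_length and set(seq) <= set("ACGT")


for gene, seq in SGRNA_TARGETS.items():
    print(gene, is_valid_oligo(seq), gc_percent(seq))
```

A check like this catches stray characters or dropped bases, but it says nothing about on-target efficiency or off-target risk.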


Cell culture 


HEK293T and Neuro 2a (N2A) cells were pur- 
chased from ATCC, and SH-SY5Y cells from 
Sigma. HEK293/RD(LM)-YFP
cells, which express the tau RD (amino acids
244 to 372 of the full-length tau 4R2N iso- 
form) harboring the pro-aggregation muta- 
tions P301L and V337M, were kindly provided 
by Dr. M. Diamond (48). QBI293/tau P301L-GFP 
cells were previously described (33). HEK293T, 
HEK293/RD(LM)-YFP, and QBI293/tau P301L- 
GFP cells were cultured in Dulbecco’s modi- 
fied Eagle’s medium (DMEM), SH-SY5Y cells 
in DMEM/F12(1:1), and N2A cells in Eagle’s 
minimum essential medium. All media con- 
tained penicillin/streptomycin and 10% fetal 
bovine serum. Cells were maintained at 37°C 
in a humidified incubator with 5% CO₂.

Primary neurons were prepared from hippocampi
or cerebral cortices of wild-type or
PS19 mouse pups (postnatal day 1). Tissues
were removed and put into ice-cold Hank’s 
balanced salt solution containing 10 mM HEPES, 
chopped into small strips, and digested by 
papain (1 mg/ml) for 30 min at 37°C. DMEM 
with 10% heat-inactivated fetal bovine serum 
was added to terminate the digestion reac- 
tion. After trituration, cells were harvested 
by centrifugation at 1000g and resuspended 
in Neurobasal medium containing 2% B27, 1% 
penicillin/streptomycin, and 2 mM GlutaMAX. 


TRIM knockout cells 


To knock out TRIM10, TRIM11, TRIM26, TRIM36,
or TRIM55 in HEK293T cells, control plenti- 
CRISPRv2 vector or plentiCRISPRv2 encod- 
ing TRIM10, TRIM11, TRIM26, TRIM36, or 
TRIM55 sgRNA was cotransfected in HEK293T 
cells with the packaging plasmids pMD2.G 
(Addgene, catalog no. 12259) and psPAX2 
(Addgene, catalog no. 12260) in a ratio of 4:1:3 
with polyethylenimine (PEI). The medium was
replaced 12 hours later. Viral particles were 
collected at 60 hours post-transfection and 
centrifuged at 1200 rpm for 5 min, filtered 
with a 0.45-μm sterile filter (Millipore), and concentrated
by Lenti-X concentrator (Takara Bio,
catalog no. 631312) at a ratio of 3:1 and 4°C 
overnight, followed by centrifugation at 4500g 
and 4°C for 1 hour. HEK293T cells were then 
infected with the concentrated viral particles 
in medium containing 8 μg/ml polybrene.
Lentiviral-transduced cells were selected with
2 μg/ml puromycin for 7 days.
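
The 4:1:3 cotransfection ratio described above can be converted into per-plasmid DNA amounts once a total amount is chosen. The sketch below assumes that the ratio is by mass and follows the order in which the plasmids are named (plentiCRISPRv2 construct, pMD2.G, psPAX2); the total of 8 μg is a hypothetical value, as the text does not state one.

```python
def split_dna_by_ratio(total_ug: float, ratio: dict) -> dict:
    """Split a total DNA mass (in micrograms) across plasmids according to integer ratio parts."""
    total_parts = sum(ratio.values())
    return {name: total_ug * parts / total_parts for name, parts in ratio.items()}


# Assumed ordering of the 4:1:3 ratio; the total mass is hypothetical.
RATIO = {"plentiCRISPRv2 construct": 4, "pMD2.G": 1, "psPAX2": 3}
masses = split_dna_by_ratio(8.0, RATIO)
for plasmid, micrograms in masses.items():
    print(f"{plasmid}: {micrograms:.2f} ug")
```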


Mice


Tau P301S (PS19), or B6;C3-Tg(Prnp-MAPT*P301S)PS19Vle/J,
heterozygous male breeders
(strain 008169) and 3xTg-AD, or
B6;129-Tg(APPSwe,tauP301L)1Lfa Psen1tm1Mpm/Mmjax
(MMRRC strain 034830-JAX), homozygous male
and female breeders were purchased from 
Jackson Laboratories. The PS19 colony was 
maintained through heterozygous breeding 
with C57Bl/6J wild-type male mice, and hetero- 
zygous PS19 transgenic mice were used in this 
study. The 3xTg-AD colonies were maintained 
through homozygous mating. Genotype was 
confirmed using polymerase chain reaction 
(PCR). The mice were housed in groups of 
three to five, with food and water available 
ad libitum and at a constant temperature of 
23°C in a 12-hour light/dark cycle. All proce- 
dures involving animals were approved by 
the institutional animal care and use commit- 
tee of the University of Pennsylvania. 


cDNA/siRNA transfection and
lentiviral transduction 


cDNA plasmids were transfected into cultured 
cells using Lipofectamine 2000 (Invitrogen) or 
PEI, and siRNAs were transfected using Lip- 
ofectamine RNAiMAX. When both siRNA and 
cDNA were used, cells were first transfected with 
siRNA for 24 hours and then with cDNA plasmid 
for another 24 hours. QBI293/tau P301L-GFP cells 
stably expressing mCherry or mCherry-TRIM11 
were generated by lentiviral transduction, using 
the third-generation lentiviral packaging system. 
HEK293T cells were transfected with mCherry
or mCherry-TRIM11 plasmid, together with the
helper plasmids Gag, Rev, and VSVG. Lentiviral 
vectors were obtained by centrifuging culture 
medium at 10,000 rpm for 18 to 20 hours and 
used to transduce QBI293/tau P301L-GFP cells 
(33). After 3 days of viral transductions, mCherry- 
positive cells were selected using fluorescence- 
activated cell sorting and grown in DMEM. 
These cells were further selected by two rounds 
of fluorescence-activated cell sorting and used 
in this study. 


Screening of TRIM proteins on tau 


HEK293T cells cultured in six-well plates for 
24 hours and at ~80 to 90% confluence were 
transfected with 0.5 μg of pRK5-GFP-tau P301L
and 2 μg of the indicated pCDH-EF1-FHC-TRIM-FLAG-HA
plasmids using PEI. Medium
was changed 12 hours later. Cells were har- 
vested 48 hours after transfection and lysed 
with 150 μl of ice-cold lysis buffer [50 mM
Tris, pH 8.8, 100 mM NaCl, 5 mM MgCl₂, 0.5%
NP-40, 1 mM dithiothreitol (DTT), 250 IU/ml 
benzonase, 1 mM PMSF, and 1x complete pro- 
tease inhibitor cocktail] for 30 min on ice. 
Lysates were centrifuged at 13,000 rpm and 
4°C for 15 min. NP-40-soluble supernatants 
were designated as the SN fraction. NP-40-insoluble
pellets were washed once with 500 μl
of precooled phosphate-buffered saline (PBS)
buffer and resuspended in 50 μl of ice-cold pellet
buffer (20 mM Tris, pH 8.0, 15 mM MgCl₂, 1 mM
DTT, 250 IU/ml benzonase, 1 mM PMSF, and
1x complete protease inhibitor cocktail) on ice
for 30 min, and then added to 25 μl of SDS-containing
buffer (6% SDS, 20 mM Tris, pH 8.0,
and 150 mM DTT). Samples were heated at 95°C 
for 5 to 10 min. Aggregated proteins in pellets 
that could be solubilized by SDS were desig- 
nated as the SDS-soluble pellet fraction (PE). 
Both the SN and PE fractions were boiled 
in SDS sample buffer (final concentration: 
62.5 mM Tris, pH 6.8, 2% SDS, 10% glycerol, 
100 mM DTT, 0.01% bromophenol blue), and 
resolved by SDS-PAGE. Proteins were trans- 
ferred to a nitrocellulose membrane. After 
blocking with 5% nonfat milk in Tris-buffered 
saline with Tween, the membrane was incu- 
bated with anti-GFP, anti-HA, and anti-HSP90 
antibodies, followed by horseradish peroxidase- 
conjugated secondary antibodies. Signals were 
detected using the SuperSignal Western Blot
Substrate Bundle on an ECL detection system
(Bio-Rad ChemiDoc Touch Imaging System,
chemiluminescence/fluorescence detection).
The sample loading and exposure time of the 
immunoblots were controlled so that all protein 
bands were detected within the linear range. 
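
Several centrifugation steps in these methods are given in rpm rather than in relative centrifugal force (RCF, in units of g), which depends on the rotor radius. The following sketch uses the standard relation RCF = 1.118 × 10⁻⁵ × r × rpm², with r in centimeters. The rotor radius used here is hypothetical, because the rotor is not specified in the text.

```python
import math

# Standard conversion factor for radius in centimeters and speed in rpm.
RCF_CONSTANT = 1.118e-5


def rpm_to_rcf(rpm: float, radius_cm: float) -> float:
    """Convert rotor speed (rpm) to relative centrifugal force (x g)."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rcf_to_rpm(rcf: float, radius_cm: float) -> float:
    """Convert relative centrifugal force (x g) to rotor speed (rpm)."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


# Hypothetical microcentrifuge rotor radius; not stated in the methods.
HYPOTHETICAL_RADIUS_CM = 8.4
print(round(rpm_to_rcf(13_000, HYPOTHETICAL_RADIUS_CM)))  # 15871
```

With a different rotor the same 13,000 rpm corresponds to a different force, so reporting g values is the more reproducible choice.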


Real-time quantitative PCR 


For real-time quantitative PCR (RT-qPCR), total 
RNA was extracted using TRIzol according to the 
manufacturer’s instructions. One microgram 
of RNA was reverse transcribed using the High- 
Capacity cDNA Reverse Transcription Kit. Gene 
expression was determined by SYBR Green-based 
RT-qPCR using an ABI ViiA 7 system (Applied 
Biosystems, Foster City, CA, USA) and normalized
to GAPDH. The following forward (F)
and reverse (R) primers were used (5' to 3',
human sequences unless otherwise indicated):
tau-F: GAGGCGGGAAGGTGCAGATAATTAATAA,
tau-R: CTGGTTTATGATGGATGTTGCC;
TRIM10-F: CTGCCCCATCTGTCAGGGTA,
TRIM10-R: GGTATCTCACAGTAGCGGGTAA;
TRIM11-F: TACTGGGAGGTGGAGGTTGGG,
TRIM11-R: GGATCTCGGGAAAGATGAATAGCA;
TRIM26-F: T@CACTACTACTGTGAGGACG,
TRIM26-R: TCCTTAGGGTACTCAGGTGGT;
TRIM36-F: GAGCTGTTTACCCACCCATTG,
TRIM36-R: CTGATCCCACATCGTTGAATGA;
TRIM55-F: TTGTCAGCACAACCTGTGTAG,
TRIM55-R: CCCATGTCTATCCAAAACCACTT;
GAPDH-F: GCTAAGGCTGTGGGCAAGG,
GAPDH-R: GGAGGAGTGGGTGTCGCTG;
mouse (m) TRIM11-F: GCTGGCAATGGGTTCTGGAT,
mTRIM11-R: GCTTGACAGAGGTGAGAAGAGGG;
mGAPDH-F: GGGCATCTTGGGCTACACTG,
mGAPDH-R: CATGAGGTCCACCACCCTGT.
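
Normalization of RT-qPCR data to GAPDH is often done with the comparative Ct (2^-ΔΔCt) method. The sketch below illustrates that calculation under the assumptions that this method applies and that amplification efficiency is close to 100% for both primer pairs; the text states only that expression was normalized to GAPDH. All Ct values are hypothetical.

```python
def relative_expression(
    ct_target_sample: float,
    ct_ref_sample: float,
    ct_target_control: float,
    ct_ref_control: float,
) -> float:
    """Fold change of a target gene versus a control condition (2^-ddCt).

    Each target Ct is first normalized to the reference gene (here GAPDH)
    measured in the same sample.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target crosses threshold one cycle later in the
# sample than in the control, with identical GAPDH Ct values.
print(relative_expression(26.0, 18.0, 25.0, 18.0))  # 0.5
```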


Immunoblot of human brain samples 


Postmortem human brain tissues from 23 neuropathologically
confirmed AD cases and 14
control individuals without a history of dementia
or other neurologic disorders (Fig. 2A
and table S2) were obtained with approval of 
the University of Pennsylvania institutional 
review board and after obtaining informed 
consent from patients or their families. Pro- 
teins were extracted from these human brain 
tissues as previously described (76). Briefly, 
gray matter from frontal cortices was homo- 
genized in a high-salt, sarkosyl-containing buf- 
fer (10 mM Tris-HCl, pH 7.4, 800 mM NaCl, 
1 mM EDTA, 2 mM DTT, protease inhibitor 
cocktail, 1 mM PMSF, PhosSTOP, 0.1% sarko- 
syl, and 10% sucrose) with homogenizer in 
nine volumes of buffer per gram tissue. After 
30 min on ice, lysates were centrifuged at 
10,000g and 4°C for 10 min. The supernatants
were collected, and protein concentrations were
measured using the Bradford assay (Bio-Rad Labs).
Samples were boiled in SDS sample buffer and then
analyzed by SDS-PAGE. Two different anti-TRIM11
antibodies (Millipore, catalog no. ABC926 and 
Proteintech, catalog no. 10851) were used for the 
data shown in Figs. 2B and 4E, respectively. 


IHC and IF analyses of human brain tissues 


Human brain tissues were fixed in para- 
formaldehyde, paraffin embedded, cut into 
6-μm-thick sections, deparaffinized in xylene,
and rehydrated in ethanol (100 to 50%). For 
IHC, rabbit anti-TRIM11 antibodies were
diluted 1:500 in 3% normal goat serum (NGS)
in PBS and, after antigen retrieval, applied to
the rehydrated tissue sections overnight
at 4°C. After being washed five times
with PBS, sections were incubated with biotin- 
conjugated secondary antibody, linked to avidin 
by incubating with the ABC kit for 1 hour, and 
color developed with 3,3'-diaminobenzidine 
(DAB) solution. For IF, tissue sections were 
incubated with rabbit anti-TRIM11 antibody, 
mouse anti-p-tau antibody AT8, and/or mouse 
anti-NeuN antibody (1:500 in PBS containing 
3% NGS) at 4°C overnight. After being washed 
five times with PBS, the sections were incubated 
with Alexa Fluor 488- and Alexa Fluor 555- 
conjugated secondary antibodies. Sections 
were then dehydrated, cleared in xylene, and 
mounted using DPX mounting medium (Elec- 
tron Microscopy Science) and analyzed using 
an inverted fluorescence microscope (Revolve, 
Echo Laboratories). 


Protein purification 


GST, GST-TRIM11, GST-tau, 6xHis-GFP-tau, 
and 6xHis-GFP-tau P301L were purified from 
bacteria. BL21 DE3 cells (Thermo Fisher 
Scientific, catalog no. C600003) containing 
the corresponding plasmids were grown at 
37°C to A₆₀₀ nm = 0.6 to 0.8, and induced for
protein expression with 0.5 mM IPTG at 19 
to 20°C for 20 hours. Cells were collected and 
resuspended in a buffer containing 50 mM 
Tris-HCl, pH 7.4, 500 mM NaCl, 200 mM KCl, and
10% glycerol (for GST or GST fusions) or a buffer
containing 50 mM NaH₂PO₄ and 300 mM NaCl
(for 6xHis-GFP-tau and 6xHis-GFP-tau P301L), 
each supplemented with complete protease in- 
hibitor cocktail, 1 mM PMSF, 1 mM DTT, and 
1 mg/ml lysozyme. Cells were lysed by sonica- 
tion. Cell lysates were centrifuged at 13,000 rpm 
and 4°C for 30 min. 

For purification of GST and GST fusion pro- 
teins, supernatants were applied to a column 
(Qiagen, catalog no. 34694) packed with glu- 
tathione beads and incubated at 4°C for 2 hours. 
The column was washed extensively with 
washing buffer (50 mM Tris-Cl, pH 7.5, and 
150 mM NaCl). The bound proteins were 
eluted with elution buffer (50 mM Tris-Cl, 
pH 7.5, 150 mM NaCl, and 20 mM glutathi- 
one). Fractions were collected at 0.5 ml each, 
and fractions containing GST or GST fusions 
were concentrated and desalted with centrif- 
ugal filters (Millipore, catalog no. UFC800308). 
For purification of 6xHis-GFP-tau and 6xHis- 
GFP-tau P301L, supernatants were incubated 
with Ni-NTA agarose at 4°C for 2 hours. Beads 
were washed extensively with wash buffer 
(50 mM NaH₂PO₄, 300 mM NaCl, and 20 mM
imidazole, pH 8.0). Fusion proteins were eluted
by elution buffer (50 mM NaH₂PO₄, 300 mM
NaCl, and 400 mM imidazole, pH 8.0), concen- 
trated, and desalted with centrifugal filters. Tau- 
441 was purified as previously described (77). 

Flag-TRIM11 and Flag-TRIM117™ were puri- 
fied from HEK293T cells as previously described 
(78, 79). Briefly, HEK293T cells transfected 
with TRIM11 plasmids were lysed in IP-lysis 
buffer (20 mM Tris-HCl at pH 7.4, 150 mM 
NaCl, 0.5% Triton X-100, 0.5% NP-40, and 10% 
glycerol) with sonication. Supernatants were 
incubated with anti-Flag M2 Affinity Gel at 
4°C for 4 hours to overnight. The gel was washed 
sequentially with lysis buffers containing ad- 
ditional 0, 0.25, 0.5, 1, 0.5, 0.25, and 0 M KCl, and 
then with Tris buffer or sodium phosphate buf- 
fer. Recombinant proteins were eluted with 
3xFLAG peptide at 4°C for 1 hour, concen- 
trated, and desalted with centrifugal filters. 


Protein turnover 


For the cycloheximide chase assay, HEK293T 
cells were cultured in 12-well plates for 24 hours 
and then transfected with indicated plasmids. 
Twenty-four hours after transfection, cells 
were treated with cycloheximide (50 µg/ml) 
for different durations. Cells were harvested 
and SN and PE fractions were extracted as 
described above. To assay turnover of tau 
P301L-GFP in QBI293/tau P301L-GFP cells 
stably expressing mCherry or mCherry plus 
TRIM11, cells were cultured in medium containing 
Dox to induce tau P301L-GFP expression 
and then in medium without Dox, as 
previously described (33). Turnover of pre- 
existing tau P301L-GFP with time after Dox 
withdrawal was analyzed by Western blot. 


BiFC assay 

HEK293T cells were seeded into six-well plates 
and cultured for 24 hours to reach 80 to 90% 
confluence. Cells were cotransfected with the 
indicated plasmids. Twelve hours later, the 
medium was replaced with fresh medium, and cells 
were cultured for another 12 hours. Fluorescence 
was observed on a Revolve Microscope 
Demo (Echo Laboratories). 


Coimmunoprecipitation and pulldown assays 


The coimmunoprecipitation assay was per- 
formed as previously described (21). Briefly, 
HEK293T cells were transfected with the indicated 
expression plasmids and, when indicated, 
treated with 100 nM OA for 2 hours. Cells 
were lysed in lysis buffer (50 mM Tris, pH 7.4, 
200 mM NaCl, 0.2% Triton X-100, 1 mM DTT, 
1 mM PMSF, and 1x complete protease inhibitor 
cocktail) on ice for 30 min. Cell lysates 
were centrifuged. Supernatants were collected 
and incubated at 4°C with the indicated pri- 
mary antibody overnight and then with Protein 
A/G-agarose beads for 4 hours. After extensive 
washes, immunoprecipitates were resuspended 
in SDS sample buffer and boiled for 5 min. Im- 
munoprecipitates and whole-cell lysates were 
analyzed by SDS-PAGE and Western blot. For the 
pulldown assay with purified recombinant proteins, 
6xHis-GFP-tau or 6xHis-GFP-tau P301L 
was incubated with GST or GST-TRIM11 immobilized 
on beads. After extensive washes, pulldown 
samples were analyzed along with inputs 
by Western blot and/or Coomassie blue staining. 


Colocalization assay 


To assay colocalization of endogenous TRIM11 
and tau, SH-SY5Y, N2A, and primary neuronal 
cells were fixed for 10 min in PBS containing 
4% paraformaldehyde and 4% sucrose at room 
temperature, permeabilized with 0.5% Triton 
X-100 for 5 min, and blocked for 30 min with 10% 
NGS in PBS. Cells were incubated with rabbit 
anti-TRIM11 antibody, mouse anti-tau antibody 
(for all cells), and chicken anti-MAP2 antibody 
(for neurons only) at 4°C overnight. After being 
washed three times with PBS, cells were incu- 
bated with Alexa Fluor 488-conjugated goat 
anti-rabbit, Alexa Fluor 555-conjugated goat 
anti-mouse, and Alexa Fluor 405-conjugated 
goat anti-chicken (for MAP2 staining) second- 
ary antibodies at room temperature for 1 hour. 
After being washed three times with PBS, coverslips 
were mounted on glass slides, and fluorescence 
images were captured by confocal 
microscopy. To quantify the colocalization of 
endogenous TRIM11 and tau, SH-SY5Y or N2A 
cells or the same lengths of neuronal dendrites 
were randomly selected, and colocalization was 
quantified with the ImageJ plugin JACoP. 


Proximity ligation assay 


PLA on endogenous TRIM11-tau interaction in 
SH-SY5Y, N2A, and primary cultured neurons 
was performed using the Duolink In Situ Red 
Starter Kit from Sigma-Aldrich. Cells were 
fixed for 10 min in PBS containing 4% para- 
formaldehyde at room temperature, washed 
with PBS, permeabilized with 0.5% Triton 
X-100 for 5 min, and blocked with Duolink 
blocking solution at 37°C for 60 min. Cells were 
then incubated with rabbit anti-TRIM11 anti- 
body and mouse anti-tau antibody at 4°C over- 
night. Secondary antibodies conjugated with 
oligonucleotides were added to the reaction 
and incubated for 1 hour at 37°C. Ligation 
and amplification were performed by incubat- 
ing with the ligation solution at 37°C for 30 min 
and then with amplification buffer contain- 
ing polymerase at 37°C for 100 min. Slides 
were mounted with mounting medium containing 
DAPI, and the fluorescence images were 
captured by confocal microscopy. As negative 
controls, PLA was performed in the absence of 
any primary antibodies or in the presence of 
only one primary antibody. 


SUMOylation assay 


For SUMOylation assay in cells, HEK 293T cells 
were transfected with the indicated plasmids for 
48 hours and treated with MG132 (10 mM) for 
6 hours. Cells were lysed in lysis buffer (60 mM 
Tris-HCl, pH 7.4, 150 mM NaCl, 0.5% Triton, 1 mM 
DTT, 1 mM PMSF, and 1x complete protease 
inhibitor cocktail) supplemented with 2% SDS 
and 50 mM DTT. Cell lysates were boiled at 95°C 
for 10 min, diluted 10-fold in lysis buffer con- 
taining no SDS, and incubated with anti-GFP 
antibody at 4°C overnight and with protein A/G 
agarose beads for an additional 2 hours [de- 
naturing immunoprecipitation (d-IP)]. d-IP sam- 
ples were washed extensively and analyzed 
along with cell lysates by Western blot with the 
indicated antibodies. 

For the SUMOylation assay with purified recombinant 
proteins, the reaction was performed at 
37°C for 1.5 hours in 20 µl of reaction buffer (60 mM 
Tris, pH 7.5, and 2.5 mM DTT) containing purified 
GFP-tau, GFP-tau P301L, GST-tau, or GST-tau 
P301L (400 nM each), SUMO E1 (125 nM), SUMO 
E2 (1 µM), Flag-TRIM11 or Flag-TRIM117™ (300 nM 
each), His-SUMO2 (25 µM), and 10 mM Mg2+-ATP. 
The reactions were stopped by adding 
20 µl of IP-lysis buffer containing 2% SDS and 
50 mM DTT and heating at 95°C for 10 min. 
Reaction mixtures were diluted into 1.5 ml of 
IP-lysis buffer without SDS. GFP-tau and GFP- 
tau P301L were immunoprecipitated by anti- 
GFP antibody (anti-GFP d-IP), and GST-tau 
or GST-tau P301L by anti-GST antibody (anti- 
GST d-IP). After extensive washing, d-IP sam- 
ples were analyzed along with the reaction 
mixtures by Western blot using the indicated 
antibodies. 
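
The dilution steps in the denaturing immunoprecipitation can be checked with simple arithmetic. The sketch below is illustrative only: it assumes the 20 µl of 2% SDS stop buffer is added to the 20 µl reaction, that the 1.5 ml of SDS-free buffer is added on top of that 40 µl, and that volumes are additive. The function name `diluted_concentration` is hypothetical.

```python
def diluted_concentration(conc: float, volume_ul: float, added_volume_ul: float) -> float:
    """Concentration after adding a solute-free diluent (volumes assumed additive)."""
    if volume_ul <= 0 or added_volume_ul < 0:
        raise ValueError("volumes must be positive (diluent may be zero)")
    return conc * volume_ul / (volume_ul + added_volume_ul)


# Assumption: the 20 µl stop buffer (2% SDS) is diluted by the 20 µl reaction.
sds_after_stop = diluted_concentration(2.0, 20.0, 20.0)

# Assumption: the resulting 40 µl is then diluted with 1.5 ml of SDS-free buffer.
sds_final = diluted_concentration(sds_after_stop, 40.0, 1500.0)

print(f"SDS after stop buffer: {sds_after_stop:.2f}%")
print(f"SDS during immunoprecipitation: {sds_final:.3f}%")
```

Under these assumptions the SDS concentration falls from 1% after stopping the reaction to roughly 0.03% during antibody incubation.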


Prevention and reversal of tau fibrillation 


For prevention, purified tau-441 (10 µM), tau-441 
P301L (7.5 µM), or GST-tau (10 µM) was induced 
to aggregate by heparin (30 µM) 
in a reaction buffer (20 mM Tris-HCl, pH 7.4, 
100 mM NaCl, 1 mM EDTA, and 1 mM DTT) in 
the presence of the indicated concentrations 
of GST or GST-TRIM11 at 37°C for 24 hours. Tau 
fibrillization was analyzed by ThT binding, 
as previously described (80). Tau aggregation 
was also examined by sedimentation assay. 
After centrifugation at 13,000 rpm at 4°C for 
30 min, the pellet fraction was analyzed by 
Western blot to detect SDS-soluble (PE) amor- 
phous aggregates and by dot blot to detect SDS- 
resistant (SR) fibrillar aggregates that were 
larger than the pore size of the membrane 
(0.2 µm) (78, 21, 79). For reversal, preformed 
tau filaments (1 µM based on monomers) were 
incubated with or without the indicated concentrations 
of GST-TRIM11 or GST in a reaction 
solution (50 mM HEPES, pH 7.5, 50 mM 
KCl, 5 mM MgCl2, and 1 mM DTT) at 37°C for 
24 to 48 hours. The reaction mixture was an- 
alyzed by ThT-binding and sedimentation as- 
says as described above. 


Negative-stain electron microscopy 


Samples of the prevention and reversal assays 
(5 µl) were applied to a thin carbon grid that 
had been glow discharged for 2 min using a Pelco 
easiGlow instrument. Five microliters of freshly 
made 2% uranyl acetate stain solution was 
applied and incubated with the sample for 
2 min on the grid. Excess sample and stain 
were blotted away with Whatman filter paper. 
The staining process was then repeated, and 
the grid was dried until imaged. Transmission 
electron micrographs were collected using 
a Tecnai T12 transmission electron microscope 
at 100 keV. The images were recorded on 
a Gatan OneView 4K x 4K camera. Each image 
was collected by exposing the sample for 4 s, 
and a total of 100 dose-fractionated images 
were collected and combined into a single 
micrograph. The data were collected at -1.5 to 
2 µm underfocus at 30 to 40 K magnification. 


rAAV9-TRIM11 and rAAV9-GFP vector production 


Human TRIM11 (with a C-terminal HA tag) 
and GFP were cloned into the plasmid 
pENN.AAV9.CB7.CI.WPRE.rBG, in which their expression 
is driven by CB7, a chicken β-actin 
promoter with cytomegalovirus enhancer elements. 
TRIM11 and GFP plasmid DNAs were 
prepared by using the endotoxin-free mega- 
prep kit (Qiagen) and characterized by struc- 
ture and sequence analysis. Plasmid preparation 
and packaging were performed by the Vector 
Core of the University of Pennsylvania’s Gene 
Therapy Program, as previously described (87). 


Transduction of neurons with AAV vectors 
and ASOs 


Primary neurons were allowed to grow for 
4 days before treatments. rAAV9-GFP was 
diluted to 5 x 10° GC/ml, and rAAV9-TRIM11 
was diluted to 1.9 x 10" GC/ml in Neurobasal 
medium for neuronal transduction. Expres- 
sion of GFP was detected by anti-GFP anti- 
body, and expression of human TRIM11, 
which was tagged with HA to be distinguished 
from mouse TRIM11, was detected by anti-HA 
antibody. To knock down TRIM11 in neurons, 
control ASO and TRIM11 ASO no. 5 or the 
indicated TRIM11 ASOs were added to the medium 
at a final concentration of 10 µM, and 
knockdown effects were analyzed by Western 
blot 3 days later. For long-term TRIM11 knock- 
down, ASOs were added again at a later time 
point (see below). Fluorescence-labeled oligos 
were used to monitor cellular uptake. 


PFF-induced aggregation of endogenous tau 


Fibril transduction on HEK293T cells was per- 
formed as previously described (48). Briefly, 
cells were transfected with control or TRIM11 
plasmid and 1 day later with tau PFFs (which 
were incubated with Lipofectamine-2000 in 
OptiMEM for 20 min and then added to cells at 
the final concentration of 400 nM). Eighteen 
hours later, cells were washed and analyzed 
with a fluorescence microscope. Cells were 
lysed in NP-40-containing buffer, and soluble 
and insoluble tau species were analyzed by 
SDS-PAGE as described above. Tau aggregates 
that were stable in 2% SDS were analyzed 
by semi-denaturing detergent agarose gel 
electrophoresis (SDD-AGE). In addition, tau 
oligomers were analyzed by dot blot using the 
anti-tau oligomer antibody T22. Fibril trans- 
duction of primary neurons was performed as 
previously described (82). Neurons were seeded 
on plates on day 1 and transduced with AAVs or 
treated with ASOs on day 4. Tau PFFs (which 
were diluted in PBS and sonicated with 60 pulses) 
were added to neurons at 1.5 µg per well on day 7. 
ASO transduction was performed again on 
neurons on day 14. Neurons were analyzed by 
IF on day 21. 


Analysis of cultured primary neurons 


Hippocampal neurons and cortical neurons 
were washed with PBS and fixed for 10 min 
in PBS containing 4% paraformaldehyde and 
4% sucrose at room temperature. After fixing, 
the cells were washed with PBS, permeabilized 
with 0.5% Triton X-100 for 5 min, and blocked 
for 30 min with 10% NGS in PBS. Cells then 
were incubated at 4°C overnight with primary 
antibodies against phospho-tau (AT8 and MC1, 
1:2,000), GFP (1:2,000), HA (1:500), synaptophysin 
(1:500), PSD95 (1:300), NFL (1:500), 
and/or MAP2 (1:2,000). After primary antibody 
incubation, cells were washed and incubated 
with Alexa Fluor 488-, Alexa Fluor 555-, or Alexa 
Fluor 405-conjugated secondary antibodies 
(Invitrogen, 1:1,000) in the dark for 90 min. 
Hoechst 33342 (Invitrogen, 1:10,000) was used 
to stain DNA. Coverslips were then mounted 
on glass slides, and fluorescence images were 
captured by confocal laser scanning microscopy. 
AT8, MC1, and NFL signals were normalized 
on the basis of cell numbers. PSD95 
and synaptophysin signals were normalized 
on the basis of dendrite length. MAP2 staining 
was used for dendrite length quantification. 
For this, single neurons were randomly se- 
lected, and the total dendrite length of each 
neuron was traced and measured with the 
ImageJ plugin NeuronJ. For measuring cell 
viability, neurons were treated with control or 
TRIM11 ASOs or transduced with AAV9-GFP 
or AAV9-TRIM11 and treated with myc-K18/P301L 
PFFs for 2 weeks. Cells were analyzed 
with the Cell Counting Kit-8 (Dojindo Laboratories, 
CK04) according to the manufacturer’s 
instructions. 


Stereotaxic injection 


PS19 or 3xTg-AD mice of either sex were an- 
esthetized with a ketamine/xylazine mixture 
(100 mg/kg ketamine, 10 mg/kg xylazine, 
intraperitoneally) and immobilized in a stereotaxic 
frame (Angle II, Leica Biosystems). A 10-µl 
Hamilton syringe was used for injection at 
predetermined coordinates. Mice were injected 
into hippocampus (in relation to bregma: AP, 
-2.5 mm; ML, +2.0 mm; and DV, -1.8 mm) or 
lateral ventricle (in relation to bregma: AP, 
+0.3 mm; ML, -1.07 mm; and DV, -2.50 mm) 
with (i) rAAV9-TRIM11 or rAAV9-GFP (1 x 10" 
GCs for ICV injection and 2 x 10”° for hippocampus 
injection) or (ii) rAAV9-TRIM11 or 
rAAV9-GFP plus K18 tau PFFs (2 µg/µl, 2.5 µl). 
Specifically, PS19 mice were injected in the 
hippocampus either at 2.5 months of age with 
rAAV9-GFP or rAAV9-TRIM11 (euthanized 
at 10 months of age) or at 4 weeks of 
age with rAAV9-GFP or rAAV9-TRIM11 plus tau 
PFFs (euthanized at 12 weeks of age). 3xTg-AD 
mice were injected with rAAV9-GFP or rAAV9- 
TRIM11 either in the hippocampus at 12 months 
of age or in the lateral ventricle at 9 months of 
age (both euthanized at 13 months of age). All 
animals were given analgesics (bupivacaine at 
10 mg/kg) during surgery and monitored dur- 
ing and for 3 days after surgery. 


Open-field test 


Exploratory locomotor behavior was analyzed 
in a multiple-unit open-field maze (SD Instruments) 
consisting of four activity chambers, 
each measuring 50 cm (length) x 
50 cm (width) x 38 cm (height) and made from 
white, high-density, nonporous plastic. 
Mice were placed in the center of the 
chamber at the start of the test. After a 2-min 
acclimation period, mice were monitored 
for total distance traveled, total movement 
time, total freezing time, and time spent in 
each of the designated quadrants of the cham- 
ber for 10 min using a video camera coupled 
to automated tracking software (ANY-maze, 
SD Instruments). 


ORT 

Mice were first habituated to the empty open- 
field maze (referenced above) for 5 min. 
Twenty-four hours later, mice were placed in 
the center of the chamber with two identical 
objects placed 5 cm away from the walls and 
allowed to explore these objects for 10 min. 
Twelve hours later, mice were placed in the 
center of the chamber with the same two 
identical objects and allowed to explore for 
10 min. Six hours later, mice were placed in 
the center of the chamber with one famil- 
iar object and one novel object and allowed 
to explore for 10 min. The mice were recorded 
and tracked using a video camera coupled to 
automated tracking software (ANY-maze, 
SD Instruments). The preference index was 
calculated by dividing the time spent explor- 
ing the novel object by the sum of the time 
exploring the novel object and time explor- 
ing the familiar object and multiplying this 
quotient by 100 to get a percentage (68). 
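
The preference index described above can be written out as a short calculation. This is a minimal sketch, not the authors' analysis code; the function name `preference_index` and the example exploration times are hypothetical.

```python
def preference_index(novel_s: float, familiar_s: float) -> float:
    """Percentage of total object-exploration time spent on the novel object."""
    total = novel_s + familiar_s
    if novel_s < 0 or familiar_s < 0 or total <= 0:
        raise ValueError("exploration times must be non-negative with a positive sum")
    return 100.0 * novel_s / total


# Hypothetical example: 30 s exploring the novel object, 20 s the familiar one.
print(preference_index(30.0, 20.0))
```

A value of 50 corresponds to equal exploration of the two objects; values above 50 indicate a preference for the novel object.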


Y-maze test 


To measure hippocampus-dependent memory, 
mice were tested for spontaneous alternation 
behavior in a Y-maze (ANY-maze, SD Instruments). 
Mice were placed in the center of a 
three-armed Y-maze and tracked for a period 
of 5 min. Spontaneous alternation behavior 
was scored as the proportion of alternations 
(entries into an arm different from the previous 
two choices) to the total number of alternation 
opportunities, according to the following 
formula: spontaneous alternation % = [no. of 
spontaneous alternations/(total no. of 
arm entries - 2)] x 100. 
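
The scoring rule can be illustrated with a short sketch that counts alternations from a recorded sequence of arm entries. It assumes, as an illustration only, that arms are labeled with single characters and that consecutive entries are always into different arms; the function name `spontaneous_alternation_percent` is hypothetical.

```python
def spontaneous_alternation_percent(entries: str) -> float:
    """Percent alternation from a sequence of arm labels, e.g. "ABCACB".

    An entry counts as an alternation when it differs from both of the
    two preceding entries; the first two entries offer no opportunity.
    """
    n = len(entries)
    if n < 3:
        raise ValueError("at least three arm entries are required")
    alternations = sum(
        1
        for i in range(2, n)
        if entries[i] != entries[i - 1] and entries[i] != entries[i - 2]
    )
    return 100.0 * alternations / (n - 2)


# Hypothetical sequence of six arm entries (four alternation opportunities).
print(spontaneous_alternation_percent("ABCACB"))
```

In this example the triplets ABC, BCA, and ACB are alternations and CAC is not, giving three alternations in four opportunities.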


Wire hang test 


To measure grip strength, mice were placed 
on a mesh wire and allowed to acclimatize 
for 30 s before the mesh was flipped. The 
wire was suspended 30 cm above an empty 
clean cage and the latency to fall was mea- 
sured. The maximum exploration time was 
3 min. The average latency to fall in three tests 
performed over 1 day was analyzed. 


Immunoblot of mouse brain tissues 


Hippocampi of euthanized mice were carefully 
dissected and stored at -80°C until use. Hippocampi 
were lysed in lysis buffer (60 mM 
Tris, pH 8.8, 100 mM NaCl, 5 mM MgCl2, 0.5% 
NP-40, 1 mM DTT, 250 IU/ml benzonase, 
1 mM PMSF, and 1x complete protease inhibitor 
cocktail) on ice for 30 min. Lysates were 
centrifuged at 13,000 rpm and 4°C for 15 min. 
The NP-40-soluble supernatants were collect- 
ed as the SN fraction. The NP-40-insoluble 
pellets (PE) were washed with PBS, resuspended 
in pellet buffer (20 mM Tris, pH 8.0, 
15 mM MgCl2, 1 mM DTT, 250 IU/ml benzonase, 
1 mM PMSF, and 1x complete protease 
inhibitor cocktail) on ice for 30 min, and boiled 
in buffer containing 2% SDS. SN and PE fractions 
were resolved by 10% SDS-PAGE and 
analyzed by Western blot. 


IHC analysis of mouse brains 


Mice were transcardially perfused with PBS 
and 4% paraformaldehyde (PFA), after which 
the brains were removed, postfixed in PFA for 
48 hours, thoroughly rinsed in PBS, and then 
embedded in paraffin blocks from which 7-um- 
thick sections were cut for histological analysis. 
Slides were baked at 60°C for 30 min, followed 
by deparaffinization with xylene (two washes x 
5 min) and rehydration through ethanol gra- 
dient (1 min 100% EtOH, 1 min 95% EtOH, 
1 min 70% EtOH, and 1 min 50% EtOH). Anti- 
gen retrieval was done as previously described 
(83). Specifically, brain tissue sections were 
incubated in 10 mM sodium citrate buffer 
(pH 8.5) for 30 min at 90°C. After antigen 
retrieval, sections were washed with tap water 
for at least 5 min, washed with PBS, and per- 
meabilized with blocking buffer (10% goat 
serum, 2% bovine serum albumin, 0.1% Triton 
X-100, and 0.05% Tween 20 in PBS) at room 
temperature for 1 hour. Antibodies directed 
against AT8 (1:200), GFAP (1:1,000), Iba1 (1:1,000), 
MAP2 (1:500), and NFL (1:500) were incubated 
with the sections at 4°C overnight and 
processed using DAB staining as previously 
described (84). NeuN staining was performed 
using the NeuN ready-to-use IHC Kit (Proteintech) 
according to the manufacturer’s instructions. 
Sections were counterstained with 
hematoxylin, dehydrated, cleared in 
xylene, and mounted on slides using DPX 
mounting medium (Electron Microscopy 
Sciences). Samples were visualized, and 
photomicrographs of sections were captured using 
an inverted fluorescence microscope (Revolve, 
Echo Laboratories). Staining was quantified 
using ImageJ. The hippocampus area was out- 
lined and isolated, and the colors were sepa- 
rated using Color Deconvolution H&E DAB. 
The threshold was set automatically, and the 
intensity of staining was measured. For AT8, 
NFL, NeuN, and MAP2, the mean of the stain- 
ing intensity was calculated and is presented 
as relative intensity. For GFAP and Iba1, the 
stained percentage area of the hippocampus 
was calculated. 
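
The two readouts described above (mean staining intensity and stained percentage area within the outlined hippocampus) can be sketched as follows. This is an illustration rather than the ImageJ workflow itself: it assumes the DAB channel has already been separated, that higher values mean stronger staining, and that the threshold is supplied by the caller instead of being set automatically. The function names are hypothetical.

```python
from typing import Sequence


def roi_values(image: Sequence[Sequence[float]],
               roi_mask: Sequence[Sequence[bool]]) -> list:
    """Collect DAB-channel values of pixels inside the outlined region."""
    values = [
        value
        for img_row, mask_row in zip(image, roi_mask)
        for value, inside in zip(img_row, mask_row)
        if inside
    ]
    if not values:
        raise ValueError("region of interest contains no pixels")
    return values


def mean_intensity(image, roi_mask) -> float:
    """Mean staining intensity within the region of interest."""
    values = roi_values(image, roi_mask)
    return sum(values) / len(values)


def percent_stained_area(image, roi_mask, threshold: float) -> float:
    """Percentage of region pixels at or above the (assumed) threshold."""
    values = roi_values(image, roi_mask)
    stained = sum(1 for value in values if value >= threshold)
    return 100.0 * stained / len(values)


# Hypothetical 2 x 2 image; the bottom-right pixel lies outside the region.
image = [[10, 200], [150, 30]]
mask = [[True, True], [True, False]]
print(mean_intensity(image, mask), percent_stained_area(image, mask, 100))
```

Relative intensity, as reported for AT8, NFL, NeuN, and MAP2, would then be the mean intensity of each sample divided by that of a chosen reference group.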


Software 


The software programs used in this study were 
as follows: GraphPad Prism 7 (https://www.graphpad.com/scientific-software/prism/), 
ImageJ (https://imagej.net/Welcome), and 
ZEN lite (https://www.zeiss.com/microscopy/en/products/software/zeiss-zen-lite.html). 


Statistical analysis 


The number of mice used for each Western 
blot, IHC, and cognitive/behavioral analysis 
is indicated in the figure legends. Each of the 
experiments was conducted at least three 
times or with at least three biological replicates. 
Data are presented as mean ± SD or 
mean ± SEM. Unless otherwise stated, a two-tailed 
Student’s t test was used to evaluate 
the statistical significance of the difference in mean 
values between two populations (*P < 0.05; **P < 
0.01; ***P < 0.001). GraphPad Prism 7 was 
used to perform statistical analysis and to 
create figures. 
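
The summary statistics and significance notation described above can be sketched with the Python standard library. This is illustrative only and is not the GraphPad Prism procedure: it computes the mean, SD, and SEM, the pooled-variance (Student's) t statistic with its degrees of freedom, and maps a P value to asterisks. The P value itself would come from a statistics package; the "ns" label and all function names are assumptions.

```python
import math
import statistics


def summarize(values):
    """Return (mean, sample SD, SEM) for one group."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return mean, sd, sd / math.sqrt(len(values))


def student_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / df
    se = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (statistics.fmean(a) - statistics.fmean(b)) / se, df


def significance_stars(p: float) -> str:
    """Map a P value to the asterisk notation (*P < 0.05; **P < 0.01; ***P < 0.001)."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"  # assumption: label used here for P >= 0.05


# Hypothetical measurements from two groups of three replicates each.
control, treated = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
print(summarize(control), student_t(control, treated))
```
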


Deploying synthetic coevolution and machine 
learning to engineer protein-protein interactions 


Aerin Yang, Kevin M. Jude, Ben Lai, Mason Minot, Anna M. Kocyla, Caleb R. Glassman, 
Daisuke Nishimiya, Yoon Seok Kim, Sai T. Reddy, Aly A. Khan, K. Christopher Garcia* 


INTRODUCTION: Protein-protein interactions 
mediate biological functions important for 
cell physiology. Interacting proteins coevolve 
over millennia through sampling of mutations, 
largely at the protein-protein interface, to achieve 
the “best fit” for the required function. Protein 
engineering methods can generate large libra- 
ries of amino acids in a protein binding site for 
screening against other proteins of fixed se- 
quence, mirroring one-half of the evolution 
process. However, it has been challenging to 
develop in vitro systems to coevolve two proteins 
against one another by using “library-on-library” 
approaches that can recover matched 
pairs of coevolved proteins. An efficient syn- 
thetic system for bidirectional, simultaneous 
protein-protein coevolution could serve as a plat- 
form to simulate natural coevolution. It could also 
be a way to engineer large numbers of protein- 
protein complexes with different recognition 
properties for biotechnology applications. 


RATIONALE: The crux of the problem for library-on-library 
selections is the loss of connectivity 
of discrete pairs of interacting proteins during 
the selection process. We developed an approach 
for efficient recovery of matched pairs from very 
large libraries of amino acids on both sides of a 
protein-protein interface. Our solution was to 
display the protein as a complex on the surface 
of yeast. We made libraries of amino acids within 
the interface of this protein complex representing 
~1 billion variants and recovered only 
the protein complexes. In this fashion, the yeast 
that we recovered contained the sequences of 
both mutant interacting proteins. 


Depiction of proteins coevolving in vitro, through energetic connectivity (“lightning bolts”) of amino 
acids at their binding sites. The relationships between these coevolving amino acids enable computational 
prediction of protein complexes. 


RESULTS: Using this strategy, we created several 
types of coevolution libraries that showed that 
we could recover interacting pairs of thousands 
of interface mutants. The mutant complexes dis- 
played a vast diversity of specificities, orthogo- 
nalities, and affinities and revealed unanticipated 
ways that the interfaces structurally compensated 
for mutations and mediated specificity 
versus promiscuity. With such a large volume 
of data, we used systems and network-level anal- 
ysis of the binding interactions to map evolu- 
tionary pathways and the thermodynamic basis 
for interface evolution. 

We explored the potential of machine learn- 
ing to engineer previously unknown interfaces 
using our large collection of coevolved sequence 
pairs. Specifically, we investigated if embeddings 
from protein language models, pretrained on 
the evolutionary history of extant protein se- 
quences, could be used to model our coevolved 
protein-protein interfaces. Our objective was to 
make in silico predictions about mutations not 
present in our initial library, as well as com- 
plexes involving novel amino acids. Through a 
process known as “transfer learning,” we were 
able to predict and subsequently validate com- 
plexes with amino acid sequences that were 
not included in the original library. This method 
allowed us to increase the amino acid diversity 
of our libraries, surpassing the experimental 
limits of yeast display. 


CONCLUSION: The integration of a synthetic co- 
evolution platform with machine learning has 
allowed us to interrogate a protein-protein in- 
teraction with exceptional granularity, but also 
to use this information prospectively. Using this 
approach, it is possible to revisit basic principles 
of protein-protein binding with systems-level 
data depth. We expect the synergy between our 
experimental coevolution platform and com- 
putation to stimulate development of applica- 
tions in cell engineering such as orthogonal 
switches and AND gates. 


The list of author affiliations is available in the full article online. 


Deploying synthetic coevolution and machine 
learning to engineer protein-protein interactions 


Aerin Yang, Kevin M. Jude, Ben Lai, Mason Minot, Anna M. Kocyla, Caleb R. Glassman, 
Daisuke Nishimiya, Yoon Seok Kim, Sai T. Reddy, Aly A. Khan, K. Christopher Garcia* 


Fine-tuning of protein-protein interactions occurs naturally through coevolution, but this process is 
difficult to recapitulate in the laboratory. We describe a platform for synthetic protein-protein 
coevolution that can isolate matched pairs of interacting muteins from complex libraries. This large 
dataset of coevolved complexes drove a systems-level analysis of molecular recognition between 
Z domain-affibody pairs spanning a wide range of structures, affinities, cross-reactivities, and 
orthogonalities, and captured a broad spectrum of coevolutionary networks. Furthermore, we harnessed 
pretrained protein language models to expand, in silico, the amino acid diversity of our coevolution 
screen, predicting remodeled interfaces beyond the reach of the experimental library. The integration 
of these approaches provides a means of simulating protein coevolution and generating protein 
complexes with diverse molecular recognition properties for biotechnology and synthetic biology. 


In evolutionary biology, the concept of coevolution 
underscores the compensatory 
relationships between biological systems 
that occur as a result of evolutionary pressures. 
Coevolution refers to reciprocal changes 
that occur under selective pressures between 
pairs of biomolecules or living organisms to 
fine-tune functions. Charles Darwin introduced 
the concept of coevolution by observing the 
relationship between the length of insects’ 
proboscises and the size of orchids’ spurs, which 
led him to predict the evolutionary changes of 
insects that could suck from the deep spur of 
Darwin’s orchid (1). By analogy, interacting proteins 
often undergo coupled mutations within 
or proximal to their interfaces to maintain or 
refine their functional interactions (2-4). Phy- 
logenetic sequence information reveals that 
correlated mutations accumulate through nat- 
ural evolution, suggestive of compensatory 
changes occurring between interacting resi- 
dues (5, 6). 

Protein coevolution has been difficult to 
study experimentally in the laboratory with 
reconstituted systems. Although directed evo- 
lution through phage or yeast surface display 
has enabled efficient screening to discover bind- 
ers with improved affinity and specificity toward 
a fixed target protein (7), it has been more 
challenging to execute “library-on-library” selections 
to coevolve both sides of a protein-protein 
interface concurrently (8-10). In vitro, co-selection 
by mixing separate libraries is limited by the 
inability to isolate discrete coevolved pairs from 
complex mixtures, thereby losing connectivity 
between the sequences of members of inter- 
acting pairs (8). Coevolution studies that use 
in vivo functional selections such as bacterial 
in vivo screening (11, 12) or yeast two-hybrid 
systems (13), as well as in vitro screening strategies 
including yeast mating systems (9, 10) or 
compartmentalized two-hybrid systems (14), 
have been reported, but these systems are lim- 
ited by small library sizes resulting in acquisi- 
tion of sparse information rather than a broad 
evolutionary spectrum. 

An additional practical limitation to developing 
a synthetic coevolution system is that 
the diversity of experimental combinatorial 
libraries is limited, which makes it impossible 
to experimentally explore the entire sequence 
space required to fully sample a protein-protein 
interface. However, recent advances in 
protein language models (15, 16) and transfer 
learning make it possible to “transfer” 
knowledge learned 
from a subset of combinations to predict the 
binding affinity of a larger set of amino acid 
combinations that have not been experimentally 
tested. This enables effective exploration 
of a much larger space of combinations and 
identification of those that perform the desired 
function. 

A high-throughput system for coevolving 
protein-protein interfaces could have prac- 
tical utility for protein engineering in bio- 
technology and serve as a powerful basic tool 
to interrogate fundamental properties of molecular 
recognition. Here, we describe a strategy 
to achieve coevolution of protein-protein 
pairs by using a high-throughput screening 
platform for library-on-library-based directed 
evolution. We adopted the Z domain of staph- 
ylococcal protein A and its affibody-binder 
dimer complex as a model system (17). The 
large dataset of interacting mutant sequences 
was subjected to systems-level analysis of mo- 
lecular recognition. High-resolution crystal struc- 
tures of orthogonal mutant pairs elaborated 
compensatory changes in predicted covarying 
residues and structural adaptations. By track- 
ing the mutational trajectories of coevolved 
mutants, we observed continuous changes in 
connectivity and specificity between mutants. 
We show that the set of coevolved protein 
pairs can inform machine learning algorithms 
to predict new complexes with amino acid 
compositions not encoded within the exper- 
imental libraries. 


Results 
Design of interprotein coevolution and validation 
of selection strategy 


To develop a platform for protein-protein co- 
evolution using yeast surface display, we adapted 
the yeast display α-agglutinin system to display 
two different proteins expressed as a single 
chain connected by a flexible linker (Fig. 1A). A 
3C protease site (-LEVLFQGP-) was inserted 
within the linker to enable 3C protease cleav- 
age of the connected proteins. After proteolytic 
cleavage, the first protein and its associated 
c-Myc tag remain covalently attached to the 
yeast cell surface while the second protein 
and hemagglutinin (HA) tag are liberated. The 
surviving noncovalently connected interacting 
pairs, together with the associated yeast clones, 
can then be isolated with C-terminal HA-tag 
binding antibodies. The identities of both in- 
teracting proteins can then be determined by 
DNA sequencing of the enriched yeast clones. 
We wished to execute proof-of-concept ex- 
periments for this strategy using a simple sys- 
tem of small stable proteins, so we chose the 
complex [dissociation constant (KD) = 10 nM] 
of Z domain and its affibody binder, ZpA963 
[Protein Data Bank (PDB): 2M5A]. This is a 
model system with an interface idealized through 
phage display (78). We tested the cleavage-capture 
efficiency of three forms of fluorescently la- 
beled anti-HA-tag antibodies with different 
valency [Fab, immunoglobulin G monoclonal 
antibody (IgG mAb), Fab+streptavidin (SA) 
complex] to determine whether their fluores- 
cence was maintained after 3C protease cleav- 
age of the linker between two proteins (fig. 
S1A). We found that both bivalent IgG mAb- 
labeled cells and tetrameric complex (Fab+SA), 
but not monovalent Fab-labeled cells, main- 
tained their staining levels of 45.8 and 69.9% 
of uncleaved cells, respectively, after 3C cleav- 
age (Fig. 1A). We then chose six key residues 
forming the central hydrophobic portion of the 
interface on the basis of the nuclear magnetic 
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Fig. 1. Design and valida- 
tion of protein-protein 
coevolution strategy. 

(A) Schematic representa- 
tion of protein-protein 
coevolution workflow. The 
a-agglutinin yeast surface 
display system was used to 
display two proteins con- 
nected by a flexible linker. A 
3C protease site within the 
linker enabled cleavage, 

and the interacting proteins 
can be captured by 
C-terminally bound anti-HA 
antibody (red). POI, protein 
of interest. (B) Close-up 
view of key residues 

in the hydrophobic cavity 
of Z domain (green) and 
affibody ZpA963 (blue) 
(PDB: 2M5A). Encoded 
amino acids are used for 
two separate libraries, 

HL1 and HL2 (bottom). 

(C) On-yeast cleavage- 
capture assay of interacting 
pair (Z+ZpA963) and non- 
interacting pair (6xAla). 
Data are mean ± SD; n = 3 independent replicates. (D) Correlation between
on-yeast cleavage-capture assay and binding affinity of Z domain-affibody
dimer mutants measured by SPR. The on-yeast cleavage-capture assay shows a
strong semilog-linear relationship (R² = 0.8382) with binding affinity (pKD).

(E) Histogram of the flow-cytometric analysis. HA-tag fluorescence in the library 
shows strong enrichment after MACS and FACS for HL1 and HL2 libraries. PM, post 
MACS; PF, post FACS. (F) Sequence-frequency logo of NGS data in the naive library 
and after final round of FACS. The original sequence (FLI+FIL) is derived from 

Z domain (A) and ZpA963 (B) dimer. The libraries converged back to the original 
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sequences either exactly or with minimal variations. The color scheme represents 
hydrophobic (black), polar (green), basic (blue), acidic (red), and neutral (purple) 
amino acids. (G) On-yeast cleavage-capture assay of the six most frequent mutants 
from HL1 and HL2 NGS data (sequence of each mutant: 1, FIL+FIL; 2, FLI+FIL; 3, FII+FVL;
4, FLI+FVL; 5, FLI+FII; 6, FII+FII). All six mutants show different levels of steady-state
binding of HA-tag fluorescence during 3C protease cleavage. Data are mean ± SD;

n = 3 independent replicates. Single-letter abbreviations for the amino acid residues are 
as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L,
Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, Trp; and Y, Tyr. 


resonance structure of dimeric Z+ZpA963,
accounting for 406 Å² of the 1662-Å² buried
solvent-accessible surface area (BSA) on the
two protein chains (Fig. 1B). When these six
residues (F13, L17, and I31 in Z, and F17, I31,
and L35 in ZpA963) were each mutated to
alanine (6xAla), the fluorescence of the
antibody-stained yeast cells fell to 0.63%
within 10 min after 3C protease cleavage,
whereas cells displaying the interacting pair
retained 68.7% of their fluorescence after 1 hour
(Fig. 1C). We optimized different linker lengths 
[18, 22, and 26 amino acids] and various com- 
ponents (one copy or two copies of 3C protease 
site, HA tag in the linker or at C terminus) 
and magnetic-activated cell sorting (MACS) 
selection in the cleavage-capture assay (fig. S1, 
B to F). These general considerations and op- 
timization strategies can be applied to other 
protein-protein complexes that one wishes to 
implement into this coevolution platform. The 
on-yeast cleavage-capture assay is highly corre- 
lated with the dimer binding affinity, showing 


a log-linear relationship [coefficient of determination
(R²) = 0.8382] at submicromolar
affinity range (Fig. 1D). 
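The semilog relationship can be illustrated with an ordinary least-squares fit of retained fluorescence against pKD. The sketch below is illustrative only: the five data points are hypothetical values chosen to resemble a submicromolar series, not the measurements behind Fig. 1D, and the function names are our own.

```python
import math


def pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKD = -log10(KD)."""
    return -math.log10(kd_molar)


def linear_fit(x, y):
    """Least-squares line y = slope * x + intercept, plus R squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical (KD in mol/L, % fluorescence retained after cleavage) pairs.
data = [(8e-9, 70.0), (2e-8, 62.0), (5e-8, 48.0), (1.5e-7, 35.0), (6e-7, 15.0)]

x = [pkd(kd) for kd, _ in data]
y = [signal for _, signal in data]
slope, intercept, r_squared = linear_fit(x, y)
```

A positive slope means that tighter binders (higher pKD) retain more HA-tag signal after cleavage, which is what allows the assay readout to serve as a proxy for affinity in this range.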

As an initial test, we asked whether the 
high-affinity Z-domain complex with ZpA963 
would converge back to its phage-display ideal- 
ized interface through coevolution (Fig. 1B). 
We generated libraries by randomizing the 
aforementioned six positions with two sets 
of degenerate codons: one set with only mini- 
mal hydrophobic amino acids (F, I, L, V, and 
M) and the other set with a more diverse set of 
amino acids (F, I, L, V, H, K, N, Q, Y, D, and E) 
(Fig. 1B). After each round of selection, library 
evolution was monitored by cleavage-capture 
assay and flow cytometry. After four or five 
rounds of positive MACS selections along with 
interspersed negative selections, both HL1 and 
HL2 libraries clearly enriched higher HA-tag 
fluorescence after 30 min of 3C protease cleav- 
age (Fig. 1E). We isolated cells displaying in- 
teracting pairs by fluorescence-activated cell 
sorting (FACS) from each round of MACS for 
further next-generation sequencing (NGS). The 
NGS results showed that the libraries con- 


verged to the original sequences exactly or 
with very few differences (Fig. 1F). Leu17 in 
Z domain (A) and Ile31 in ZpA963 (B) were 
replaceable with Ile or Val, whereas other sites
strongly converged to the original amino acids 
(Fig. 1F). Each clone was assessed by the 
cleavage-capture assay and reached different 
levels of steady-state binding of HA-tag fluo- 
rescence during 3C protease cleavage (Fig. 1G). 
Using surface plasmon resonance (SPR), we 
measured binding affinities that ranged from 
7.9 to 34.1 nM, similar to the original template 
dimer affinity of 10 nM (fig. S2, A and B). These 
data suggest that the coevolution strategy was 
able to remodel the protein interface to its 
original “optimal” state from nonideal start- 
ing points represented in the complex libraries. 
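The size of the sequence space searched by these two libraries follows directly from the encoded amino acid sets. The sketch below counts amino acid combinations only, and it assumes that all six positions in a given library share the same encoded set and that the hydrophobic set corresponds to HL1 and the diverse set to HL2. Nucleotide-level diversity is larger whenever a degenerate codon encodes an amino acid more than once.

```python
from math import prod

# Amino acid sets quoted in the text for the two degenerate codon schemes.
HYDROPHOBIC_SET = "FILVM"
DIVERSE_SET = "FILVHKNQYDE"

N_POSITIONS = 6  # three positions on Z domain plus three on ZpA963


def aa_diversity(allowed_per_position) -> int:
    """Number of distinct amino acid combinations across all positions."""
    return prod(len(set(allowed)) for allowed in allowed_per_position)


hl1_diversity = aa_diversity([HYDROPHOBIC_SET] * N_POSITIONS)  # 5 ** 6
hl2_diversity = aa_diversity([DIVERSE_SET] * N_POSITIONS)      # 11 ** 6
```

Under these assumptions the two spaces contain 15,625 and 1,771,561 amino acid combinations, respectively, so convergence on the original FLI+FIL sequence reflects selection rather than limited sampling.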


Coevolution of a low-affinity dimer creates 
optimized new interfaces 


We next generated libraries at the interface of 
a weakly associating dimer (Z+ZSPA-1) with
micromolar affinity to determine if we could
affinity-mature the interface by coevolution
(19, 20) (Fig. 2A). Consistent with its low af-
finity, the Z+ZSPA-1 pair rapidly lost its HA-tag
fluorescence within 15 min of 3C protease 
treatment (fig. S2C). On the basis of the crystal 
structure of the complex (PDB: 1LP1), nine
interfacial positions located in a central hydrophobic
patch were selected for library randomization:
five positions (Q9, F13, L17, I31,
and K35) from Z domain and four positions
(L9, V17, I31, and F32) from the ZSPA-1 domain. The
first library, LL1, was designed to use minimal
codon sets encoding both polar and hydrophobic
amino acids (F, L, I, K, H, N, Q, and Y)
for five positions on the Z domain and hydrophobic
amino acids (F, L, I, V, and M) for four
positions on the affibody ZSPA-1 (Fig. 2A). The
second library, LL2, used a more diverse codon
set encoding mixed amino acids (F, I, L, V, H,
K, N, Q, Y, D, and E) for four randomized positions
on each of the Z domain and the affibody,
so that the functional diversity (1.91 × 10⁹)
of the yeast library almost reached the
theoretical nucleotide diversity (4.29 × 10⁹).
After six rounds of positive MACS and two
rounds of FACS selections, more than 90% of
the populations enriched into the upper-right 
quadrant of flow cytometry dot plots (Fig. 2B). 
NGS data collected at each step of selection 
clearly revealed the appearance of consensus 
sequences as the selection proceeded (Fig. 2C). 
On the basis of the sequencing data, we tested 
11 clones from LL1 and 22 clones from LL2 
using the cleavage-capture assay, and all 
reached varying levels of steady-state bind- 
ing during 1-hour 3C protease treatment (Fig. 
2D). In contrast to the result of coevolution 
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Fig. 2. Engineering remodeled dimer interfaces by coevolution. (A) (Top) 
Library positions on the interface from the complex of Z domain (green, chain A) 
and ZSPA-1 (blue, chain B) (PDB: 1LP1). (Bottom) Encoded amino acids used for
making two separate libraries, LL1 and LL2. (B) (Left) Flow cytometry dot plots 
showing enrichment of HA-tag fluorescence (red squares) in the library after rounds 
6 to 8. Antibody-labeled yeast cells were cleaved with 3C protease for 30 min. 
Cells were pregated on c-Myc+. (Right) Histograms showing elevation of HA-tag 
fluorescence during selection, from round 6 (green), to 7 (blue), and 8 (red). 
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(C) Sequence-frequency logo of NGS data in naive library, rounds 6, 7, and 8, 
revealing the appearance of consensus sequences as the selection proceeded in 
both LL1 and LL2 libraries. The original sequence (QFLIK+LVIF) is derived from Z 
domain (A) and ZSPA-1 (B) dimer. The color scheme represents hydrophobic
(black), polar (green), basic (blue), acidic (red), and neutral (purple) amino acids. 
(D) On-yeast cleavage-capture assay of the mutants from LL1 (left) and LL2 (right) 
library. The altered positions compared with original amino acids are colored in 
red. Data are mean ± SD; n = 3 independent replicates.
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from the high-affinity pair, the enriched mu- 
tants from both LL1 and LL2 libraries have only 
a few conserved residues shared with the orig- 
inal template: On average, six to seven muta- 
tions were enriched (Fig. 2, C and D). The 
highest-affinity mutant from each library 
achieved approximately three-log enhanced 
affinity [KD of 1.99 nM for LL1.c1 (LIFFK+FILF)
and 1.86 nM for LL2.c3 (LVLF+FIIV)] compared
with the original dimer (KD = 2.92 μM)
(fig. S2, D to G). 
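The "approximately three-log" figure can be checked directly from the quoted dissociation constants. The short calculation below uses only the three KD values given in the text; the variable and function names are our own.

```python
import math

# Dissociation constants quoted in the text, converted to mol/L.
KD_ORIGINAL_M = 2.92e-6  # original Z+ZSPA-1 dimer
KD_LL1_C1_M = 1.99e-9    # LL1.c1 (LIFFK+FILF)
KD_LL2_C3_M = 1.86e-9    # LL2.c3 (LVLF+FIIV)


def log_fold_improvement(kd_before: float, kd_after: float) -> float:
    """Orders of magnitude gained in affinity (log10 of the KD ratio)."""
    return math.log10(kd_before / kd_after)


gain_ll1 = log_fold_improvement(KD_ORIGINAL_M, KD_LL1_C1_M)
gain_ll2 = log_fold_improvement(KD_ORIGINAL_M, KD_LL2_C3_M)
```

Both ratios correspond to roughly a 1500-fold tighter KD, that is, slightly more than three orders of magnitude.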


Synthetic coevolution yields pairs with different 
specificities and cross-reactivities 


To characterize the relationship between co- 
evolved protein sequences in our screen, we 
visualized the sequencing data as a network. 
We used statistical enrichment to identify the 
sequence pairs with the strongest likelihood 
of binding from the enriched library sequenc- 
ing data, on the basis of the overall count of the 
individual sequences in the screen. We used a 
hypergeometric test (materials and methods, 
Sequence library filter) to calculate this enrich- 
ment statistic, which compares the observed 
frequency of a particular protein pair in a 
screening library with the expected frequency 
of the pair on the basis of the overall count of 
the individual proteins in the library. If the ob- 
served frequency is significantly higher than 
the expected frequency, it suggests that the 
protein pair is enriched for interaction. We 
extracted sequences with a p value < 0.05 for 
further visualization and analysis of cross- 
reactivity and specificity. The enriched sequen- 
ces accurately predicted the binding specificity 
of each Z-A sequence, matching well with its 
actual binding specificity (Fig. 3A). 
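One way to express such an enrichment filter is a one-sided hypergeometric tail probability for each pair. The sketch below is an assumption about the parameterization (population = total reads, successes = reads carrying the Z-A sequence, draws = reads carrying the Z-B sequence); the exact procedure is described in the materials and methods and may differ. The read counts are hypothetical toy values.

```python
from collections import Counter
from math import comb


def hypergeom_sf(k: int, population: int, successes: int, draws: int) -> float:
    """Return P(X >= k) for a hypergeometric random variable X."""
    lower = max(k, 0)
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(lower, upper + 1)
    )
    return tail / total


def enriched_pairs(reads, alpha: float = 0.05) -> dict:
    """Map each (Z-A, Z-B) pair with p < alpha to its p value."""
    n_reads = len(reads)
    pair_counts = Counter(reads)
    za_counts = Counter(za for za, _ in reads)
    zb_counts = Counter(zb for _, zb in reads)
    result = {}
    for (za, zb), observed in pair_counts.items():
        p = hypergeom_sf(observed, n_reads, za_counts[za], zb_counts[zb])
        if p < alpha:
            result[(za, zb)] = p
    return result


# Hypothetical read counts per pair (not data from the screen).
toy_counts = {
    ("LVLF", "FIIV"): 30,
    ("LVLF", "FIVK"): 5,
    ("IVFF", "FILV"): 25,
    ("VFLV", "IVVY"): 20,
    ("VFLV", "FIIV"): 2,
    ("IVFF", "FIVK"): 1,
}
reads = [pair for pair, n in toy_counts.items() for _ in range(n)]
enriched = enriched_pairs(reads)
```

A pair such as LVLF+FIIV, seen far more often than the product of its marginal frequencies predicts, passes the filter, whereas a pair seen once or twice between two abundant sequences does not. With many pairs tested, a multiple-testing correction would be a natural addition; the text does not say whether one was applied.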

The sequence similarity network (SSN) is an 
efficient way to observe relationships among 
large sets of evolutionarily related proteins 
(21). We constructed SSNs using the concaten- 
ated Z-A and Z-B full-length eight-amino acid 
sequences collected from all screening rounds 
of the LL2 library. The SSN revealed clear 
connectivity between sequences from later 
rounds (rounds 5 to 7) when an edit distance 
threshold of 2 was applied (Fig. 3B, left). This 
analysis validates that our coevolution platform 
progressively enriched communities of discrete 
recognition clusters. When sequences from 
round 7 were mapped with edit distance thresh- 
old 1, the sequences formed two large, dis- 
connected groups and several smaller clusters 
(Fig. 3B, right). Several notable Z-A sequences 
were colored in the SSN. This revealed that 
the nodes with the same Z-A but differing Z-B 
were closely connected in the same cluster, 
and closely related Z-A sequences that differ 
by one amino acid could be clustered either 
together (e.g., VFLV and IFLV) or separately 
(e.g., LVLV and LVLF). The specificity similar- 
ity network (SpSN) of Z-A sequences, which 
connects nodes when two Z-A sequences have 
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Fig. 3. Visualization and mapping of coevolutionary networks. (A) Sequence logo of Z-B sequences 
paired with each Z-A sequence from the statistically enriched NGS data (p value < 0.05) and actual 
binding specificity measured by on-yeast cleavage-capture assay, normalized to the highest affinity of
each Z-A sequence (below). Filtered sequences accurately predicted binding specificity, matching the
actual binding specificity of each Z-A sequence. (B) SSNs of concatenated eight amino acid Z-A+Z-B
library position sequences from all screening rounds (left) and round 7 (right) of LL2 library. Notable
Z-A sequences are colored and specified in the panel (right). The edit distance threshold for connecting
nodes in the total library network is 2 and in the round 7 network is 1. The left SSN is colored by
screening round and demonstrates connectivity among sequences from later screening rounds (rounds 5
to 7). The right SSN is colored by Z-A sequence and provides a detailed view of the enriched stage
(round 7), showing cluster formation based on Z-A specificities. (C) Circos cross-reactivity plot of

100 sampled pairs from LL1 and LL2 (round 7 sequence data). The Circos plots illustrate the pairwise 
relationships between the 100 sampled pairs of Z-A and Z-B proteins. Each pair is normalized to 

have equal area, providing a visual representation of the approximate cross-reactivity of each sequence. 
(D) Single mutational pathway of mutants from the LL2 library connecting the original sequence 
(QFLI+LVIF) with the prominent LL2 library mutants. Mutated positions are color coded: red (one 
mutation), green (two mutations), and blue (three mutations). The number of mutations at each position 

is represented by a four-digit number next to each Z-A and Z-B sequence. (E) Plot illustrating the 
changes in ΔΔG, ΔΔH, and -ΔTΔS for three mutants in the pathway (D) compared with the original

pair (QFLI+LVIF). Mutations introduced in each step are highlighted in red. (F) Matrix to show binding 
specificity changes of the Z-A variants from the pathway. Binding affinities measured by on-yeast 
cleavage-capture assay were normalized on the basis of the highest affinity in each Z-A sequence. 

The single mutation introduced at each step is indicated in red. The highest affinity pair in each 

column is boxed in green. Control is a mutant with all library positions mutated to alanines. Data are 
mean of n = 3 independent replicates. 
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common Z-B partners, was illustrated, and the 
Z-A sequences that are clustered closely in the 
SSN were also closely connected in the SpSN 
(fig. S3). For example, VFLV, LVLV, and IFLV, 
clustered together in a big group in the SSN, 
are also closely connected in the SpSN, and 
LVLF is clustered separately in both the SSN 
and the SpSN (fig. S3). This implies that the
SSN can capture the specificity of Z-A sequen- 
ces from our coevolutionary sequence data. 
The cluster graphs, which merge each clus- 
tered community into a single node, can ef- 
ficiently show such relationships between 
coevolved mutants and the structure of coevo- 
lutionary networks throughout the different 
screening rounds (fig. S4). Collectively, this 
network-level analysis reveals the extreme 
sensitivity of the specificity and cross-reactivity 
properties of Z-A and Z-B proteins to even 
single amino acid changes. 
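The construction of an SSN with an edit distance threshold can be sketched as follows. This is a minimal illustration, not the pipeline used for Fig. 3B: the four concatenated eight-residue sequences are pairs named elsewhere in the text (LL2.c17, LL2.c7, LL2.c22, and LL2.c3), and the function names are our own.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def build_ssn(seqs, threshold: int) -> dict:
    """Connect two sequences when their edit distance is <= threshold."""
    graph = {s: set() for s in seqs}
    for a, b in combinations(seqs, 2):
        if edit_distance(a, b) <= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph: dict) -> list:
    """Return the clusters of the network as a list of node sets."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(graph[node] - comp)
        seen |= comp
        components.append(comp)
    return components


# Concatenated Z-A + Z-B library positions of pairs named in the text.
pairs = ["VFLVIVVY", "LVLFFIVK", "IVFFFILV", "LVLFFIIV"]

clusters_t2 = connected_components(build_ssn(pairs, threshold=2))
clusters_t1 = connected_components(build_ssn(pairs, threshold=1))
```

With a threshold of 2, the two LVLF-containing pairs fall into one cluster; with a threshold of 1 they separate, which mirrors how the stricter threshold resolves finer recognition clusters.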

In addition to the SSN, we used another 
visualization method to depict the cross- 
reactivity profiles of the NGS data in our co- 
evolutionary libraries. The Circos plot shows 
the pairwise relationships, highlighting the rela- 
tive cross-reactivity and orthogonality of the 
Z-A and Z-B proteins in both the LL1 and LL2 
libraries (Fig. 3C and figs. S5 and S6). We sam- 
pled 100 representative pairs to present in the 
plot, normalizing each pair to equal area to 
visualize the approximate cross-reactivity of 
each sequence. A series of Circos plots span- 
ning all screening rounds (naive, R2, R4, R5, 
R6, R7, and R8) reveals the progressive shifts 
in cross-reactivity during the selection process. 
For example, we observe the emergence of 
polyspecificity among certain dominant Z-A 
sequences and increased cross-reactivity be- 
tween sequences in later rounds of selection 
in both the LL1 and LL2 libraries (figs. S5 and 
S6). This result illustrates a broad range of spe- 
cificity and orthogonality within our library. 
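Cross-reactivity in this sense reduces to counting the distinct Z-B partners of each Z-A sequence, and a specificity similarity network links Z-A sequences that share partners. The sketch below shows both steps on a hypothetical toy list of pairs; the partner assignments are illustrative and are not taken from the NGS data.

```python
from collections import defaultdict
from itertools import combinations


def partners_by_za(pairs) -> dict:
    """Collect the set of distinct Z-B partners observed for each Z-A."""
    partners = defaultdict(set)
    for za, zb in pairs:
        partners[za].add(zb)
    return dict(partners)


def spsn_edges(partners: dict) -> dict:
    """Link two Z-A sequences when they share at least one Z-B partner.

    The edge weight is the number of shared partners.
    """
    edges = {}
    for a, b in combinations(sorted(partners), 2):
        shared = partners[a] & partners[b]
        if shared:
            edges[(a, b)] = len(shared)
    return edges


# Hypothetical enriched pairs (Z-A, Z-B), for illustration only.
toy_pairs = [
    ("VFLV", "IVVY"),
    ("VFLV", "FIIV"),
    ("LVLV", "FIIV"),
    ("IFLV", "IVVY"),
    ("LVLF", "FIVK"),
    ("IVFF", "FILV"),
]

partners = partners_by_za(toy_pairs)
cross_reactivity = {za: len(zbs) for za, zbs in partners.items()}
edges = spsn_edges(partners)
```

In this toy set VFLV is the most cross-reactive Z-A sequence and is linked to both IFLV and LVLV, whereas LVLF and IVFF remain isolated.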

We next attempted to track the mutational 
pathways of specific coevolved pairs to assess 
how these dimer interfaces were diversified 
along the course of coevolution. We generated 
single mutational evolutionary pathways con- 
necting the original sequence (QFLI+LVIF) 
with the prominent LL2 library mutants (Fig. 
3D). First, we traced the Z-A pathway from the 
cluster graphs to identify the connected inter- 
mediates starting from the original sequence 
(QFLI) to the late mutants (LVFF, IVFF) (fig.
S7). The connectivity between early mutants
(QFLI-VFLI), mid mutants (VFLI-VFLV-VFLF- 
VVLF-LVLF), and late mutants (LVLF-LVFF- 
IVFF) can be visualized from cluster graphs at 
different screening rounds (fig. S7B). The abil- 
ity to trace mutational pathways suggests that 
this platform could be useful for simulating 
natural protein-protein evolution trajectories. 
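Tracing a single-mutation pathway amounts to a graph search in which observed sequences are nodes and an edge joins any two sequences that differ at exactly one position. The sketch below uses a breadth-first search over the Z-A sequences named in the text. It is an illustration under our own assumptions: a breadth-first search returns a shortest path, which need not coincide with the pathway inferred from the cluster graphs across screening rounds.

```python
from collections import deque


def hamming(a: str, b: str) -> int:
    """Number of differing positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def single_mutation_path(start: str, goal: str, observed):
    """Shortest path from start to goal through observed sequences,
    changing exactly one residue per step. Returns None if unreachable."""
    nodes = set(observed) | {start, goal}
    parent = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        for candidate in sorted(nodes):
            if candidate not in parent and hamming(current, candidate) == 1:
                parent[candidate] = current
                queue.append(candidate)
    return None


# Z-A sequences mentioned in the text (LL2 library positions 9, 13, 17, 31).
observed_za = [
    "QFLI", "VFLI", "VFLV", "VFLF", "VVLF",
    "LVLF", "LVFF", "IVFF", "IFLV", "LVLV",
]

path = single_mutation_path("QFLI", "IVFF", observed_za)
```

For this node set the search skips VFLV, because VFLI and VFLF already differ by a single residue; the pathway reported in the text passes through VFLV, reflecting the order in which intermediates appeared during selection.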

To investigate the structural energetic mech- 
anism mediating the changes in specificity 
during coevolution, we measured thermody- 
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namic binding signatures by performing iso- 
thermal titration calorimetry (ITC) of several Z 
domain-affibody pairs along an evolutionary 
pathway (Fig. 3E and figs. S8A and S9). We see 
clear evidence for enthalpy-entropy compen- 
sation over the course of coevolution and a 
trend in which early strongly favorable enthalpy 
and unfavorable entropy transition to produce 
a less-favorable binding enthalpy compensated 
by a more neutral entropy (Fig. 3E and fig. S8). 
For example, we sampled representatives from 
the LL2 mutational pathway from the “founder” 
pair (QFLI+LVIF) to IVFF+FILV (Fig. 3, D and 
E). Although the overall free-energy landscape 
of this trajectory is flat, we see changes when 
examining the entropic and enthalpic terms. 
Binding of VFLV+IVVY and LVLF+FIIV is 
highly enthalpically favored and entropically 
disfavored, but by the end of the trajectory, we 
see a more moderate enthalpy of binding 
coupled with a moderately disfavored entropy 
in IVFF+FILV. Although we could not observe 
any structural features that distinguish cross- 
reactive versus selective complexes, the ther- 
modynamic properties of cross-reactive mutants 
(A-VFLV and A-LVLF) and a specific mutant 
(A-IVFF) differed. We also followed a single 
mutational three-step evolutionary trajectory 
from LILFK+FIVM to LIFFK+FILF, which are 
the two high-affinity orthogonal pairs from 
the LL1 library showing similar thermodynamic
trends (fig. S8A). A considerable thermodynamic
transition occurred when Leu17ᴬ was
mutated to Phe to produce a less-favorable bind- 
ing enthalpy compensated by a more neutral 
entropy in a specific mutant (LIFFK+FILF). 
Phenylalanine is often conserved in protein- 
protein binding sites, and aromatic residues 
frequently serve as anchor residues to mediate 
protein-protein interactions (22). The common 
mutation in both mutants (LVLF to IVFF and 
LILFK to LIFFK), Leu17ᴬPhe, may act as a new
anchor residue, thus leading to more entropically 
favored interactions between two proteins (23). 
We next verified by cleavage-capture assay 
the relative specificities of each Z-A sequence 
toward Z-B sequences from this evolutionary 
pathway (Fig. 3F). Starting from the early mu- 
tants, the specificity matrix clearly indicates 
gradual and continuous compensatory changes 
of binding preference between variants along 
the mutational pathway for both the LL1 and 
LL2 libraries (Fig. 3F and fig. S8B). Thus, we 
could systematically track the diversification of 
specificities and cross-reactivities within our 
library by mapping the coevolutionary network. 
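Enthalpy-entropy compensation follows from the relations ΔG = RT ln(KD) and ΔG = ΔH - TΔS: two complexes with the same affinity can partition the same free energy very differently. The sketch below makes this concrete. The temperature of 298.15 K, the shared KD of 10 nM, and both ΔH values are hypothetical illustrations, not ITC measurements from this study.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g(kd_molar: float, temp_k: float = 298.15) -> float:
    """Binding free energy in kcal/mol from KD (1 M standard state)."""
    return R_KCAL * temp_k * math.log(kd_molar)


def minus_t_delta_s(dg: float, dh: float) -> float:
    """Entropic term -T*dS in kcal/mol, from dG = dH - T*dS."""
    return dg - dh


# Hypothetical complexes with identical affinity but different enthalpy.
kd = 10e-9                 # assumed KD in mol/L
dh_early = -18.0           # assumed: strongly favorable enthalpy
dh_late = -12.0            # assumed: more moderate enthalpy

dg = delta_g(kd)
entropy_term_early = minus_t_delta_s(dg, dh_early)
entropy_term_late = minus_t_delta_s(dg, dh_late)
```

A positive -TΔS is an entropic penalty. In this illustration the penalty shrinks from about 7 to about 1 kcal/mol while ΔG stays at about -10.9 kcal/mol, the pattern of a flat free-energy landscape with shifting enthalpic and entropic contributions.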


Direct coupling analysis (DCA) and structural 
adaptations in coevolved complexes 


We sought to evaluate the accuracy of coevo- 
lutionary patterns between residues in predict- 
ing protein-interaction contacts (Fig. 4). The 
coevolution of residues in protein sequences is 
affected by epistatic couplings, which may or 


may not match with structural contacts (24). 
First, we used mutual information (MI), a mea- 
sure of the statistical coupling between any two 
positions in a protein pair, which can reflect 
structural interactions (25). To do this, we again 
filtered protein pairs to statistically enrich for 
those pairs occurring significantly more often 
than expected, mirroring our approach in the 
SSN analysis (Methods summary). Using these 
filtered pairs, we calculated pairwise MI be- 
tween all residues in the LL1 and LL2 screens 
(fig. S10). MI serves as a local information 
theoretical metric, enabling us to determine 
the level of dependence between two positions. 
Our results showed that the top-ranked inferred
coupling (17ᴬ-31ᴮ) was consistent with
known contacts in the three-dimensional 
(3D) structure of the original pair, indicating 
that structural constraints are captured in the 
sequence coevolution. Next, we applied a DCA 
framework to the unfiltered LL2 sequences,
which constitute a larger and more complex 
library (25). Our goal was to determine if direct 
interactions could be inferred with the in- 
creased size and complexity of the LL2 library 
and a global statistical method. We used the 
inverse covariance matrix to infer direct con- 
tacts. The columns in the matrix represent 
residues from one protein, rows represent 
residues from another protein, and elements 
represent the statistical dependencies between 
residues (Fig. 4A). By analyzing the inverse 
covariance matrix, we identified 13ᴬ-9ᴮ and
17ᴬ-31ᴮ as strongly interacting pairs, which
supports their direct contact with each other 
in 3D structures. The top five highly correlated 
residues were close in the original structure, 
but the overall relationship between inter- 
residue distance and DCA score was weak 
(Fig. 4B). 
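The MI calculation between an interface position on Z-A and one on Z-B can be sketched as below. MI is a local, pairwise measure; the DCA step described above is a global method based on the inverse covariance matrix and is not reproduced here. The six sequence pairs are a hypothetical toy alignment constructed so that positions 17ᴬ and 31ᴮ covary perfectly.

```python
import math
from collections import Counter

# Residue numbers of the LL2 library positions, as listed in the text.
ZA_POSITIONS = (9, 13, 17, 31)
ZB_POSITIONS = (9, 17, 31, 32)


def mutual_information(col_x, col_y) -> float:
    """Mutual information in bits between two aligned columns."""
    n = len(col_x)
    count_x = Counter(col_x)
    count_y = Counter(col_y)
    count_xy = Counter(zip(col_x, col_y))
    mi = 0.0
    for (x, y), c in count_xy.items():
        p_joint = c / n
        mi += p_joint * math.log2(p_joint / ((count_x[x] / n) * (count_y[y] / n)))
    return mi


# Hypothetical filtered pairs (Z-A, Z-B); not data from the screen.
toy_pairs = [
    ("LVLF", "FIIV"),
    ("LVLF", "FIIV"),
    ("IVFF", "FILV"),
    ("IVFF", "FILV"),
    ("LVFF", "FILV"),
    ("VVLF", "FIIV"),
]

scores = {}
for i, pos_a in enumerate(ZA_POSITIONS):
    for j, pos_b in enumerate(ZB_POSITIONS):
        col_a = [za[i] for za, _ in toy_pairs]
        col_b = [zb[j] for _, zb in toy_pairs]
        scores[(pos_a, pos_b)] = mutual_information(col_a, col_b)

top_pair = max(scores, key=scores.get)
```

In the toy alignment Leu at 17ᴬ always pairs with Ile at 31ᴮ and Phe at 17ᴬ with Leu at 31ᴮ, so this position pair carries one full bit of information and ranks first.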

To clarify these inter-residue covariations 
discovered from the sequence data, we deter- 
mined crystal structures of 10 coevolved pairs 
that spanned a range of cross-reactivities and 
orthogonalities (5 from LL1 and 5 from LL2). 
All structures were solved at high resolution 
(ranging from 1.00- to 1.92-Å resolution) (figs.
S11 to S13 and tables S1 and S2). From the
structures, we could verify clear compensatory
changes between the residues showing the
strongest covariations (13ᴬ-9ᴮ and 17ᴬ-31ᴮ)
in both LL1 and LL2 library mutants. Phe13ᴬ
of the Z domain, which is a core residue of
the central hydrophobic patch in the original
dimer, was mutated to the smaller Ile or Val in
both LL1 and LL2 library mutants, and this
was compensated by mutation of the opposing
residue, Leu9ᴮ, to the larger Phe (Fig. 4C).
We also observed another highly correlated
opposing residue pair (17ᴬ-31ᴮ) mutated in
a compensatory manner in both libraries. In
this residue pair, Leu17ᴬPhe is rotated outward,
accommodating Ile31ᴮLeu to fill the cavity between
the two proteins (Fig. 4D).
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Fig. 4. Coupling analysis and structural adapta- 
tion of coevolved variants. (A) DCA matrix 

to predict inter-residue covariation of LL2 library 
sequences (rounds 6 and 7). The DCA scores 

are normalized between 0 and 1. The pairs with 

the highest DCA scores, 13ᴬ-9ᴮ and 17ᴬ-31ᴮ, are
marked with red squares. The matrix rows represent
residues from Z-A, columns represent residues
from Z-B, and the elements represent the statistical
dependencies between residues. Through the inverse
covariance matrix analysis, the pairs 13ᴬ-9ᴮ and
17ᴬ-31ᴮ were identified as strongly interacting pairs,
indicating their direct contact in the 3D structures.
(B) (Left) Inter-residue contacts and (right) the
relationship between DCA and inter-residue distance
measured from the original pair structure (PDB:
1LP1). The dashed lines are color coded (from purple
to yellow) on the basis of the DCA matrix in (A).
The top-two highest DCA contacts (Leu17ᴬ-Ile31ᴮ,
Phe13ᴬ-Leu9ᴮ) are colored in red. The overall
relationship between inter-residue distance and
DCA score was weak (R² = 0.0203). (C to E) Close-up
views of library positions to show local side-chain
rearrangements. Pairs of residues at the center
of the dimer interface were mutated in a compensatory
manner between 13ᴬ and 9ᴮ (C) and between
17ᴬ and 31ᴮ (D). Side-chain substitutions from
four different interacting pairs are shown as sticks.
In (E), library positions 9ᴬ and 32ᴮ are closely
associated with proximal residues Gln10ᴬ and
Trp35ᴮ, maintaining the shape complementarity
between two proteins. (Bottom left) B chains of 
seven interacting pairs are aligned, with close-up 
views of the boxed region shown for each pair. 
Coupled side chains are shown as sticks with 
transparent spheres to indicate packing interactions. 


The extent of interface structural remodeling
in all complexes resulting from the coevolution
selection pressure is made clear in Fig. 4E,
where the nonmutated residues Q10ᴬ and W35ᴮ
accommodate the library mutations at positions
9ᴬ and 32ᴮ by adopting completely different
positions and local environments (Fig.
4E). The two library positions (9ᴬ and 32ᴮ)
and proximal residues (Gln10ᴬ and Trp35ᴮ) kept
close contact in all mutant structures, albeit
with different interactions. The ability to rearrange
at these positions allows decoupling of
mutations despite close proximity (Fig. 4E).
These results indicate that the protein interfaces
of both specific and cross-reactive complexes
were completely remodeled in different ways to
improve affinities up to three logs (KD of LL1.c2 =
1.86 nM, original = 2.92 μM) and bias specificities
[KD of Z-A (FILFK) with Z-B (FIVM) =
2.53 nM, and with ZSPA-1 (LVIF) = 21.9 μM].


Cross-reactivity and orthogonality in coevolved 
dimer structures 


The availability of a large panel of coevolved 
mutants allows us to ask questions about their 


Yang et al., Science 381, eadh1720 (2023) 28 July 2023 




relative cross-reactivity versus specificity. For 
example, A-LILFK has more Z-B binding part- 
ners (B = 77) than A-LIFFK (B = 15) from the 
LL1 library, and A-LVLF (B = 53) and A-VFLV 
(B = 42) are also more cross-reactive than 
A-IVFF (B = 3) from LL2 library sequence 
data. To answer the question of what causes 
differences in the cross-reactivity of certain Z-A 
sequences and to clarify specificity-determining 
residues, we compared high-affinity mutant 
structures of each Z-A sequence (Fig. 5). First, 
the binding preferences of the two highest-
affinity pairs from the LL1 library, LL1.c1
(LIFFK+FILF) and LL1.c2 (LILFK+FIVM), are
nearly completely orthogonal, so we focused
on investigating specificity-determining positions
of the two variants (Fig. 5, A and B). The
two mutants differ by only three amino acid
positions (positions 17ᴬ, 31ᴮ, and 32ᴮ) but have
virtually no cross-reactivity with each other.
Z-A sequences of LL1.c1 and LL1.c2 bind to
their own Z-B sequence with an affinity that 
is 500 times as high as when mixed with the 
other’s Z-B sequence (Fig. 5B). By contrast, 
B-FIVF, which has only a single amino acid 
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change from B-FILF or B-FIVM, has polyspeci- 
ficity and binds to both Z-A sequences (A-LIFFK 
and A-LILFK) with moderate affinity (Fig. 5B). 
Thus, the mutant LL1.c6 (LILFK+FIVF) can be 
a “bridging” intermediate to help explain the 
structural evolution of orthogonality through 
cross-reactivity. Comparing structures of LL1.c2
and LL1.c6 revealed that the single mutation
Met32ᴮPhe in LL1.c6 induced a noticeable geometric
change by forming an enhanced hydrophobic
cluster within the binding interface (Leu9ᴬ,
Leu13ᴬ, Lys35ᴬ, Phe5ᴮ, Phe32ᴮ, and Trp35ᴮ)
(Fig. 5C). Because of the rotation of Trp35ᴮ and
Trp35ᴮ-centered hydrophobic packing induced
by Phe32ᴮ, the N-terminal end of helix 1 and
the C-terminal end of helix 2 of Z-A tilted 17°
closer to Z-B. Furthermore, the compensatory
relationship between position 17ᴬ and position
31ᴮ is clearly revealed from the structures of
LL1.c1 and LL1.c2 (Fig. 4D). Taken together,
the synergistic effects of the geometric change 
and compensatory mutations found from these 
three specificity-determining positions resulted 
in biased specificities of the two high-affinity 
variants evolved from the same library. 
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Fig. 5. Specificity determinants of orthogonal high-affinity mutants. (A) The altered positions compared 
with the original amino acids are colored red, and varying positions between mutants are highlighted with 
green boxes. (B) Table of affinity between Z-A and Z-B monomers measured by SPR. LL1.c1 and c2 are
orthogonal to each other, and B-FIVF of LL1.c6 is cross-reactive to both Z-A mutants. (C) (Bottom)
Comparison of LL1.c2 and LL1.c6 structures near position 32ᴮ shows how the single mutation M32ᴮF induces
large conformational changes by side-chain rotation of Trp35ᴮ and increased hydrophobic interactions
around it. (Top left) Superposition of overall structures of LL1.c1, LL1.c2, and LL1.c6. (Top right) Close-up
views of each mutant show Trp35-centered hydrophobic interactions with surrounding residues. Position
32 is highlighted with dashed circles. (D) Table showing amino acids in library positions of the three
orthogonal LL2 mutants, LL2.c17 (VFLV+IVVY), LL2.c7 (LVLF+FIVK), and LL2.c22 (IVFF+FILV), that were
selected to compare differences in their affinity and structures. (E) Binding affinities of each combination
of Z-A and Z-B mutants of the three mutants. (F) Significant structural difference at the interface of LL2.c17
and other two mutants. (Left) Superposition of overall structures. (Right) Close-up views of interface.
LL2.c17 has Phe13ᴬ as the core of a central hydrophobic patch surrounded by multiple hydrogen bonds. LL2.c7
and c22 have a Phe9ᴮ-centered hydrophobic patch composed of clustered π-π interactions and cation-π
interactions (F31ᴬ, K35ᴬ, F9ᴮ, and W35ᴮ).


The monomers from the high-affinity mu- 
tants from the LL2 library (three Z-A mutants 
and five Z-B mutants) are even more orthogonal
to one another (fig. S2, F and G). The
three orthogonal mutants, LL2.c17 (VFLV+IVVY),
LL2.c7 (LVLF+FIVK), and LL2.c22 (IVFF+FILV), 
were selected to compare differences in their 
affinity and structures (Fig. 5D). Each Z-A mu- 
tant binds to its binding partner with nano- 
molar affinity (3.98 to 44.2 nM) but has minimal 
cross-reactivity with other monomers (Fig. 5E). 
The backbone structures of the three mutants 
are relatively similar (Cα root mean square
deviation of Z-A after aligning Z-B ranges from
0.447 to 0.641 Å). The dimer interactions of
LL2.c17 have sharply diverged from those of the
other two LL2 mutants: the Phe13ᴬ-centered
hydrophobic patch is surrounded by multiple
rewired hydrogen bonds, including an additional
hydrogen bond between Asn11ᴬ and Phe32ᴮTyr,
and α helix 2 of Z-A is slightly shifted to
generate new interactions with helix 1 of Z-B,
which explains the significantly improved
affinity between these two proteins compared
with that of the original pair (Fig. 5F and fig.
S14). The other two mutants, LL2.c7 and c22,
have clustered π-π and cation-π interactions
at the interface (Phe31ᴬ, Lys35ᴬ, Phe9ᴮ, and
Trp35ᴮ) (Fig. 5F). Additionally, the same compensatory
mutations (positions 17ᴬ and 31ᴮ) as
those occurring in LL1 mutants are also seen
in the LL2.c7 and c22 mutants (Fig. 4D). In contrast
to the LL1 library mutants, the LL2 library
mutants show less appreciable change in backbone
orientation, but their interfaces are more diverse
because a broader set of amino acids was available
at the library positions.

We do not observe systematic differences 
in the structural parameters of the interfaces 
mediating specific (A-LIFFK in LL1 and A-IVFF 
in LL2) versus cross-reactive (A-LILFK in LL1 
and A-VFLV, A-LVLF in LL2) complexes. All 
mutants except for one (LL2.c1) had a higher 
fraction of nonpolar BSA than that of the orig- 
inal dimer (58%), and all mutants except 
for one (LL2.c7) had a higher packing score 
(PackStat) than that of the original dimer
(PackStat of Z+ZSPA-1 = 0.640) (table S3). The
cross-reactive complexes did not show evi- 
dence of poorly packed interfaces and non- 
ideal bonding that might predispose them to 
promiscuity; in this sense they are indistin- 
guishable from protein interfaces of the spe- 
cific complexes. 


Using protein language models to predict 
dimer interactions from coevolved 
protein sequences 


The large database of coevolved complexes led 
us to ask if this information could be used to 
inform predictions through machine learning. 
One limitation of our experimental screen was
that we had to restrict the set of amino acid
codons in order to fully sample the diversity
of the yeast display libraries. This raised the
question of how to predict the binding affinity
of larger libraries containing more diverse
amino acids without exceeding the practical
diversity limits of the screening platform.
One solution is the use
of protein language models, which are self- 
supervised machine learning models pretrained 
on large protein-sequence databases (15, 26-28). 
We used protein language models to expand 
the set of amino acids in our screen through 
the process of transfer learning (Fig. 6A). Trans- 
fer learning involves applying knowledge 
gained from one problem to solve a related 
problem. By using a common protein lan- 
guage model to embed pairs of protein se- 
quences, we can learn complex patterns that 
predict protein-protein interactions with a 
limited set of amino acids and then apply this 
knowledge to predict binding affinity for pre- 
viously undiscovered pairs using a broader set 
of amino acids. Our two coevolution libraries, 
LL1 and LL2, used different subsets of amino 
acids to mutate library positions and yielded 
differently enriched sequences after screening 
(Fig. 2). The LL2 library (11 amino acids) has 
an expanded amino acid diversity compared
with that of the LL1 library (8 amino acids for
Z-A and 5 amino acids for Z-B), and only 3.2%
of LL2 library sequencing data are LL1-type
sequences (compatible with LL1 degenerate
codon sets), whereas LL1 data has 40% of LL2-type
sequences on average (Fig. 6B). Therefore,
[...] systems to test how the LL1-sequence-data-trained
model can expand sequence space and predict
previously unknown interactions that are only
possible with sequences encoded by LL2.

Fig. 6. Sequence-space expansion with protein language model. (A) Schematic representation
of sequence-space expansion through protein language model. (B) Fraction of LL1-type
sequences (Z-A and Z-B sequences can be encoded with LL1 degenerate codon sets)
in LL2 sequencing data and vice versa. Fractions of each screening round (from
naive to R8) are represented in a box plot with individual data points. A two-tailed
Mann-Whitney test was used to analyze results; ***P < 0.001. (C) Schematic representation
of our approach to predict dimer interactions with an expanded set of amino acids by
means of outer product-based CNN. (D) Classification efficiency of LL1-trained model on
LL2 test set. (Left) A violin plot representing predicted binding score of negative
(n = 2771) and positive (n = 2794) data. Two-[...] (Middle) A ROC plot and (right) a
precision-recall plot. The sequences in the test set were categorized into five groups on
the basis of the number of new amino acids compared with the LL1 sequence data, allowing
an assessment of the impact of dissimilarity between the two libraries on predictions.
The AUC and AP values of total sequences and each subgroup are as follows: all sequences
(n = 5565, AUC = 0.88, AP = 0.89), 0 amino acids (n = 508, AUC = 0.88, AP = 0.98),
1 amino acid (n = 1332, AUC = 0.91, AP = 0.97), 2 amino acids (n = 1509, AUC = 0.84,
AP = 0.87), 3 amino acids (n = 1091, AUC = 0.80, AP = 0.70), 4 and more amino acids
(n = 1125, AUC = 0.73, AP = 0.32). The diagonal dotted line in the ROC plot represents
AUC = 0.5. (E) The predicted binding scores of LL2 sequencing data of each screening round
are represented in a violin plot. One-way analysis of variance; ***P < 0.001,
****P < 0.0001. ns, not significant (n = 28 to 10,000). (F) Correlation between predicted
binding score of LL2 sequencing data and actual percentage of HA-tag MFI after protease
cleavage. Normalized percentage of HA-tag MFI and predicted binding score of each round
was compared by Spearman's correlation test; r = 0.9643, P = 0.0028. Data are mean ± SD;
n = 3 independent replicates for HA-tag MFI measurements. (G) Correlation between
predicted binding score and relative affinity of the pairs from the mutational pathway in
Fig. 3D. Normalized percentage of max HA-tag MFI from cleavage-capture assay and predicted
binding score of each round was compared by Spearman's correlation test; r = 0.5476,
P = 0.0855. (H) Top 11 sequences by predicted binding score from LL2 NGS data. The binding
of 6 out of the 11 sequences were verified by on-yeast cleavage-capture assay, and their
relative binding affinities were normalized to the high-affinity LL2 pair, LL2.c3
(LVLF+FIIV). n.d. = not detectable affinity by the assay. (I) A cartoon representation
depicting the expansion of sequence space from experimental LL1 data to the predicted LL2
sequence space by means of a protein language model and transfer learning.

To determine whether protein language
models could be used to model Z-affibody pairs 
in our screens, we used the pretrained ESM 
protein language model (16), which incorporates
knowledge of all amino acids from large
evolutionary sequence datasets, to generate 
embeddings of the sequences observed in 
LL1 (Fig. 6C). We predict dimer interactions 
using the outer product of individual protein- 
sequence embeddings, and the resulting outer 
product matrix is then used as input into a 
convolutional neural network (CNN) for fur- 
ther processing. This approach allows us to 
model complex interactions of protein-protein 
interface sequences, including those featuring 
a more expanded set of amino acids than 
those used in our experimental screen. 

We first encoded the individual protein se- 
quences from each protein pair in the LL1 
screen and trained a deep neural network to 
identify positive interacting pairs. Positive 
interactions were defined as enriched or fil- 
tered protein pairs occurring significantly 
more often than expected in the enriched 
library (rounds 6 and 7), reflecting our meth- 
odology in the SSN and DCA analyses (Meth- 
ods summary). Negative interactions were 
defined as protein pairs that were present in 
the naive library NGS data but absent in 
rounds 6 and 7. Given that each round cap- 
tured 7 to 10 times more cells than the ob- 
served diversity of sequences in the naive 
library, we reason that the absence of these 
interactions is most likely due to being out- 
competed during coevolution. Next, we sought 
to evaluate the performance of the LL1-trained 
model in classifying held-out positive and 
negative LL2 interactions, which contain an 
extended amino acid library (Fig. 6D). The 
LL1-trained model could classify 5565 LL2 
sequences in our held-out test set (2794 posi- 
tive and 2771 negative) with an area under the 
curve (AUC) of 0.88 (Fig. 6D). We then spe- 
cifically examined the LL1-trained model’s abil- 
ity to generalize by assessing its capacity to 
handle a progressively expanding amino acid 
library. We binned the LL2 test data on the 
basis of the number of previously unused ami- 
no acids incorporated in the held-out test se- 
quences (0, 1, 2, 3, and 4 or more) relative to the 
LL1-sequence training data. These amino acids,
which were not included in the LL1 training
library, serve as indicators of the difference
between the test and training sequences. Although
performance declines as more new
amino acids are introduced, the LL1-trained
model still achieves an AUC of 0.8 and an average
precision (AP) of 0.7, even when up to three
out of eight amino acids are not part of the LL1
training library. These tests demonstrate the
model’s robustness in handling sequence variations
and making reliable predictions.
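
The binning analysis described above can be sketched as follows. This is a minimal, dependency-free illustration rather than the authors' evaluation code: the function names, the per-position training alphabets in `allowed`, and the toy records are all hypothetical, and the novelty count is assumed to be the number of library positions whose residue falls outside the training alphabet for that position.

```python
from collections import defaultdict

def count_new_residues(seq, allowed):
    """Positions whose residue is absent from the (assumed) training alphabet."""
    return sum(1 for aa, ok in zip(seq, allowed) if aa not in ok)

def roc_auc(labels, scores):
    """AUC as the chance that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Mean of the precision values taken at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

def metrics_by_bin(records, allowed, max_bin=4):
    """records: (sequence, label, score); every bin needs both classes."""
    bins = defaultdict(list)
    for seq, label, score in records:
        bins[min(count_new_residues(seq, allowed), max_bin)].append((label, score))
    out = {}
    for k, items in sorted(bins.items()):
        labels = [y for y, _ in items]
        scores = [s for _, s in items]
        out[k] = (roc_auc(labels, scores), average_precision(labels, scores))
    return out

# Toy data (invented): four-residue sequences, label 1 = enriched, 0 = naive.
allowed = [set("VL"), set("V"), set("L"), set("F")]
records = [("VVLF", 1, 0.9), ("LVLF", 0, 0.2), ("VVLY", 1, 0.7), ("VVLK", 0, 0.8)]
result = metrics_by_bin(records, allowed)
```

Here `result` maps each novelty bin to an (AUC, AP) pair, so a drop in either metric across bins shows how quickly predictions degrade as test sequences move away from the training alphabet.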

Lastly, we applied our LL1-trained model to
all LL2 screening rounds from naive to final
round 8 to assess the model’s ability to predict
interactions of pairs in different selection stages
(Fig. 6E). The predicted binding scores
increase as screening proceeds, and
the mean of the predicted scores at each round 
is highly correlated with actual percentage of 
HA-tag fluorescence level after protease cleav- 
age (Fig. 6, E and F). We also compared the 
predicted scores with experimentally validated 
pairs in Fig. 3. The 28 validated interacting 
pairs (% of max HA tag > 1) in Fig. 3G showed 
elevated predicted binding scores (Fig. 6E). The 
model could even moderately predict affinity 
changes between pairs along the mutational 
pathway in Fig. 3E (Fig. 6G). Even though two 
intermediates (VFLF+ITVY and VVLF+FIITY) 
were predicted to have higher affinities than 
their actual affinities, overall trends are simi- 
lar between prediction and affinity throughout 
the pathway. We also evaluated the accuracy 
of our model in identifying hits among the 
top-ranked sequences. We conducted exper- 
imental validation on the binding of the 11 
highest-ranked sequences and found that 6 
out of the 11 sequences (hit rate = 54.5%) dem- 
onstrated affinities within the detectable 
range (submicromolar) as confirmed by the 
on-yeast cleavage-capture assay (Fig. 6H). These 
data demonstrate that we can use a protein 
language model to expand sequence space 
from the experimental sequence data of LL1 
and predict the new interactions that we observed
in LL2 screening data (Fig. 6I).
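
The rank correlations used to compare predicted scores with measured HA-tag signal can be illustrated with a small standard-library sketch. The helper names and the per-round values below are invented for illustration and are not the measurements reported here; Spearman's coefficient is computed as the Pearson correlation of average ranks.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Invented values for seven screening rounds (naive, R2, R4 to R8).
predicted = [0.05, 0.10, 0.30, 0.45, 0.60, 0.75, 0.80]
measured = [2.0, 1.0, 10.0, 30.0, 55.0, 70.0, 90.0]
rho = spearman_rho(predicted, measured)
```

With seven points, a single swap between adjacent ranks gives a coefficient of 27/28, which shows how sensitive the statistic is to one misordered round when so few rounds are compared.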


Discussion 


We have developed a facile method for protein- 
protein coevolution as a solution to the problem 
of linking phenotype to genotype in large-scale 
library-on-library selections (29-31). The large 
collection of interacting Z domain-affibody 
pairs we generated enabled a systems-level 
structure-function analysis of molecular rec- 
ognition within this model system. We observed 
important characteristics of natural protein- 
level coevolution, including compensatory 
mutations between residues and hydropho- 
bic core repacking. Acquiring compensatory 
mutations between directly interacting pro- 
teins is the simplest molecular mechanism that 
can cause epistasis between two genes (32, 33). 
On the basis of DCA and high-resolution crystal 
structures, we could successfully infer epistatic 
interactions between Z domain-affibody dimer 
interfaces. The crystal structures of coevolved 
mutants revealed that when a key residue of 
the original central hydrophobic patch, Phe13ᴬ,
is mutated to smaller amino acids such as leucine
or valine, Phe9ᴮ or Trp35ᴮ newly form the
core of the central hydrophobic patch, pre- 
sumably rearranging an existing hotspot or 
creating new ones. We infer that coevolving 
contact residues can fundamentally change 
binding interfaces to have different specificities
and affinities by reinforcing or rearranging
hotspots. The remodeling of the dimer
interface of the Z domain and affibody was 
similar to the repacking of the hydrophobic 
core of widely studied proteins such as Rop, 
T4 lysozyme, and λ repressor-GCN4 leucine
zipper fusions (34-38). Thus, interface coevo- 
lution appears to follow principles of protein 
core repacking (39-42). 

In the course of our experimental studies of 
coevolution, we identified a challenge in the 
form of a curse of dimensionality, in which an 
exponential increase in experimental data is 
needed to test protein interactions as the num- 
ber of mutating positions and amino acid 
alphabet increases. This issue is a major prac- 
tical limitation to protein engineering that uses 
combinatorial libraries because full-diversity 
libraries exceed the experimental diversity 
possible in yeast, phage, or ribosome display. 
To address this challenge, we used protein 
language models. Previously, protein language 
models have been limited to predicting mono- 
meric properties, and fine-grained variant ef- 
fect analysis of protein-protein interactions 
has been difficult to evaluate owing to a lack 
of data. Here, we demonstrate that by leverag- 
ing a shared sequence space learned from 
large-scale protein-sequence databases, we 
can both extract informative representations 
of protein sequences and model their bind- 
ing interactions. The amino acid composition 
of a protein encodes the information required 
to determine not just its structure but also its 
ability to negotiate interactions with other 
functional partners. Therefore, by using the 
information encoded in the latent protein- 
embedding space, we can explore a larger space 
of protein-protein interactions than is experi- 
mentally available. This approach, combined 
with transfer learning, can reduce data re- 
quirements and provide reliable predictions 
of binding interactions. 

This synthetic coevolution strategy can po- 
tentially be used in biotechnology applications. 
Although AlphaFold and RoseTTAFold are 
useful for predicting 3D protein structures 
from the amino acid sequence, predicting de 
novo protein-protein interactions remains a 
challenge (43). The experimental data gen- 
erated from our coevolution strategy can be 
used as training data for machine learning 
algorithms to expand sequence space well
beyond what can be obtained experimentally
and to predict protein-protein interactions.
The one-pot production of a large set of
protein pairs with different specificity and 
cross-reactivity is also useful for synthetic 
biology. Orthogonal interfaces are essential 
components to build reliable and predictable 
orthogonal gene circuits to avoid undesirable 
cross-talk with the host or other machinery 
(44, 45). Our synthetic coevolution strategy can 
generate user-designed orthogonal protein com- 
plexes for such applications. 
Methods summary

Protein expression and purification 

The DNA plasmids encoding each affibody
were cloned into pET28, a bacterial expression
vector. The vector includes the affibody
gene with either a C-terminal His6-tag only
or a biotin-acceptor peptide tag (BAP tag,
GLNDIFEAQKIEW) followed by a His6-tag
between the NcoI and XhoI sites of pET28b
(Novagen). To express affibody monomers, 
the vector was transformed into Escherichia 
coli BL21 (DE3), and the cells were grown 
at 37°C in TB medium supplemented with 
50 mg/liter kanamycin. When the optical density
at 600 nm (OD600) reached 0.6, 0.5 mM
isopropyl-β-D-thiogalactoside (IPTG) was added
to induce protein expression, and the cell 
culture was incubated overnight at 30°C 
before harvest. The proteins were purified by 
Ni²⁺-NTA agarose column chromatography
(Ni-NTA, Qiagen) followed by size-exclusion 
chromatography (SEC) with a Superdex 75 In- 
crease 10/300 GL column (GE Healthcare). 
The proteins were stored in HEPES buffered 
saline (HBS, 20 mM HEPES pH 7.5, 150 mM 
sodium chloride). Affibody proteins used for 
SPR experiments were site-specifically biotinylated
at the C-terminal BAP tag with BirA
ligase and repurified by SEC. 


Yeast display of single-chain Z 
domain-affibody dimers 


Single-chain affibody dimers were displayed 
on the surface of yeast S. cerevisiae strain 
EBY100 (Invitrogen, cat. no. C839-00) by fu- 
sion to the C terminus of the Aga2 protein. 
Affibody dimers connected with a GS-linker 
and 3C protease cleavage site in the middle 
were inserted between an N-terminal cMyc 
epitope and a C-terminal HA tag. N-cMyc-ZA- 
linker-ZB-HA-C insert was cloned into the 
pCT302 vector (Addgene #41845). Competent 
yeast cells were electroporated with affibody 
plasmids and recovered in YPD (Sigma, cat. 
no. Y1375) at 30°C for 1 hour. Next, recovered 
cells were grown in synthetic dextrose plus 
casein amino acids (SDCAA) media (pH 4.5, 
20 g dextrose, 6.7 g yeast nitrogen base, 5 g 
bactocasamino acids, 10.4 g sodium citrate 
and 6.4 g citric acid monohydrate dissolved in 
1 liter of deionized H2O, supplemented with 10 ml
of Gibco Penicillin-Streptomycin, 10,000 U/ml)
to OD600 10, and the cultures were induced at
20°C for 24 hours by diluting to OD600 1.0 in
SGCAA (prepared as SDCAA, but with 20 g galactose
instead of dextrose) (7). The display level
of proteins was confirmed by staining the cells 
with an Alexa Fluor 488-labeled anti-cMyc 
antibody (Cell Signaling Technology, cat. no. 
2279S) and Alexa Fluor 647-labeled anti-HA 
antibody (1:50 dilution; Cell Signaling Tech- 
nology, cat. no. 3444S), and fluorescence was 
monitored by flow cytometry (Beckman Coulter, 
CytoFLEX). 
Yeast displayed libraries

Details of library assembly, sequences, and se- 
lection protocols are provided in materials 
and methods. 


On-yeast cleavage-capture assay 


For single-clone cleavage-capture assay, colo- 
nies were picked from transformed EBY100 
cells plated on SDCAA plate. Additionally, 5 x 
10° induced yeast cells were stained with an 
Alexa Fluor 488-labeled anti-cMyc antibody 
and Alexa Fluor 647-labeled anti-HA anti- 
body (1:50 dilution). Antibody-stained cells 
were washed with MACS buffer (autoMACS 
Running Buffer, Miltenyi, cat. no. 130-091-221), 
then incubated in 20 µl of 3C protease cleavage
solution (lab-made 3C protease diluted to
0.4 mg/ml in MACS buffer) at 4°C. At each
time point, 2 µl was sampled and diluted in
100 µl of ice-cold MACS buffer, and fluorescence
was measured by flow cytometry. The measured
mean fluorescence intensity (MFI) was
divided by the MFI before cleavage to obtain the
percentage of maximum MFI, which serves as a proxy for the
affinity between the two interacting proteins.
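
As a worked illustration of this normalization, the short sketch below converts a cleavage time course into percentages of the pre-cleavage signal. The function name and the fluorescence values are hypothetical.

```python
def percent_of_initial_mfi(mfi_before, mfi_timepoints):
    """HA-tag MFI at each time point as a percentage of the pre-cleavage MFI."""
    if mfi_before <= 0:
        raise ValueError("pre-cleavage MFI must be positive")
    return [100.0 * m / mfi_before for m in mfi_timepoints]

# Invented readings: retained HA signal falls as the cleaved partner dissociates.
time_course = percent_of_initial_mfi(2000.0, [1500.0, 500.0])
```

A pair that retains a larger percentage after cleavage is read as the tighter binder, because the HA-tagged partner stays captured on the yeast surface only through the noncovalent interaction.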


Cross-reactivity Circos plots 


Circos plots were created with the circlize soft- 
ware package (46). In short, sequences with 
p value < 0.05 were combined into separate 
datasets for LL1 and LL2 and further sepa- 
rated by screening round. A cross-reactivity 
score was calculated for each distinct Z-A 
sequence by determining the number of its 
distinct Z-B pairs per dataset. Cross-reactivity 
scores were then normalized to sum to 1. To 
facilitate visualization with Circos plots, the 
dataset was then subsetted by using the “train_ 
test_split” function of the python scikit-learn 
(version 1.2.2) package. To maintain the pro- 
portion of Z-A cross-reactivity, the “stratify” op- 
tion was applied to the cross-reactivity score. 
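
The scoring and subsetting steps can be sketched with the standard library alone. The code below is an assumption-laden stand-in, not the pipeline used here: `stratified_subset` is a hypothetical helper that mimics the effect of a stratified split by sampling the same fraction from each cross-reactivity stratum, and the sequence labels are placeholders.

```python
import random
from collections import defaultdict

def cross_reactivity_scores(pairs):
    """Distinct Z-B partners per Z-A sequence, normalized so scores sum to 1."""
    partners = defaultdict(set)
    for za, zb in pairs:
        partners[za].add(zb)
    total = sum(len(v) for v in partners.values())
    return {za: len(v) / total for za, v in partners.items()}

def stratified_subset(pairs, scores, fraction, seed=0):
    """Keep roughly `fraction` of the pairs within each score stratum."""
    strata = defaultdict(list)
    for za, zb in pairs:
        strata[scores[za]].append((za, zb))
    rng = random.Random(seed)
    subset = []
    for _, members in sorted(strata.items()):
        k = max(1, round(fraction * len(members)))
        subset.extend(rng.sample(members, k))
    return subset

# Placeholder identifiers standing in for Z-A and Z-B sequences.
pairs = [("A1", "B1"), ("A1", "B2"), ("A2", "B1"), ("A3", "B2")]
scores = cross_reactivity_scores(pairs)
subset = stratified_subset(pairs, scores, fraction=0.5)
```

Sampling within strata keeps promiscuous and specific Z-A sequences in the same proportion as in the full dataset, which is the property the stratified option is meant to preserve for visualization.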


SSN, cluster graph, and SpSN 


SSNs and cluster graphs were created with the 
igraph software package (47). Nodes of the edit 
distance-based networks correspond to dis- 
tinct Z-A and Z-B pairs. Connections are pres- 
ent between nodes for instances in which the 
edit distance of two Z-A and Z-B pairs is equal 
to or below a given threshold. Nodes of the SpSN 
correspond to distinct Z-A sequences, and con- 
nections are drawn between Z-A sequences 
when Z-A sequences share common Z-B sequen- 
ces numbering above a certain threshold. 
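
A minimal sketch of the edit distance-based network follows. It assumes that all library sequences have equal length, so that edit distance reduces to the number of substituted positions (Hamming distance); the function names are hypothetical, and the example reuses pairs named elsewhere in the text.

```python
from itertools import combinations

def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def similarity_edges(pairs, threshold):
    """Edges between (Z-A, Z-B) nodes that differ at <= threshold positions."""
    nodes = sorted(set(pairs))
    edges = []
    for u, v in combinations(nodes, 2):
        if hamming(u[0] + u[1], v[0] + v[1]) <= threshold:
            edges.append((u, v))
    return edges

pairs = [("LVLF", "FIIV"), ("LVLF", "FIVY"), ("VVLF", "FIVY"), ("IVFF", "FILV")]
edges = similarity_edges(pairs, threshold=1)
```

The pairwise loop is quadratic in the number of nodes, which is adequate for a sketch but would need indexing or subsampling for datasets of the size produced by deep sequencing.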


Mutual Information 


To measure the coevolution relationship among
interface residues, we computed the MI between
two positions i,j as

$$\mathrm{MI}_{ij} = \sum_{A,B} f(A_i, B_j)\,\log\!\left(\frac{f(A_i, B_j)}{f(A_i)\,f(B_j)}\right),$$

following Gloor et al. (48), where $f(A_i, B_j)$ is the
observed frequency of the amino acid pair (A,B) at
position i,j; $f(A_i)$ is the observed frequency of
amino acid A at position i; and $f(B_j)$ is the
observed frequency of amino acid B at position j.
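
The MI definition can be made concrete with a short sketch that estimates the frequencies by counting over a list of paired sequences. The function name is hypothetical, the natural logarithm is an assumption (the base is not stated here), and the toy alignments are invented.

```python
import math
from collections import Counter

def mutual_information(pairs, i, j):
    """MI between position i of Z-A and position j of Z-B over (za, zb) pairs."""
    n = len(pairs)
    joint = Counter((za[i], zb[j]) for za, zb in pairs)
    fa = Counter(za[i] for za, _ in pairs)
    fb = Counter(zb[j] for _, zb in pairs)
    mi = 0.0
    for (a, b), c in joint.items():
        p_ab = c / n
        mi += p_ab * math.log(p_ab / ((fa[a] / n) * (fb[b] / n)))
    return mi

# Perfectly coupled positions: L always pairs with F, V always pairs with I.
coupled = [("L", "F"), ("L", "F"), ("V", "I"), ("V", "I")]
# Independent positions: every combination occurs equally often.
independent = [("L", "F"), ("L", "I"), ("V", "F"), ("V", "I")]
mi_coupled = mutual_information(coupled, 0, 0)
mi_independent = mutual_information(independent, 0, 0)
```

The coupled example reaches log 2, the maximum for two equally frequent residues, whereas the independent example gives zero, which is the contrast that lets MI flag coevolving positions.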


Inverse covariance matrix estimation 


To uncover direct coupling signals from the
MSAs, we used a method based on the estimation
of the inverse covariance matrix following
Jones et al. (25). For position i,j and
amino acid pair A,B, we compute the empirical
covariance matrix as

$$S_{ij}^{AB} = f(A_i, B_j) - f(A_i)\,f(B_j),$$

where $f(A_i, B_j)$ is the observed frequency of
amino acid pair A,B at position i,j, and $f(A_i)$ and
$f(B_j)$ are the observed frequency of amino acid A
at position i and the observed frequency of amino
acid B at position j, respectively. Then we use the
graphical lasso to estimate the inverse covariance
matrix, $\Theta$, by maximizing the objective function

$$\log[\det(\Theta)] - \mathrm{tr}(S\Theta) \quad \text{subject to the constraints} \quad \sum_{i,j} |\Theta_{ij}| \le \alpha \ \text{and} \ \Theta \succeq 0,$$

where $S$ is the empirical covariance matrix, $\Theta$ is
the inverse covariance matrix, and $\alpha$ is the sparsity
constraint parameter. We set $\alpha = 1$ in all of our
analysis. The optimization is performed with the
CVXPY v1.2 package in Python.


Data 


To train our deep learning model, we as- 
sembled positive and negative protein-protein 
pair examples from the oligopeptide pair data- 
set from the LL1 library. For enriched samples, 
we filtered the intermediate enriched library 
and applied the hypergeometric test (mate- 
rials, Sequence library filter) with a 0.05 
p-value threshold, resulting in 14,491 pairs. 
For naive samples, we randomly sampled 
14,471 pairs from the naive library that were 
not present in the intermediate enriched lib- 
rary. We then randomly split the data into 
training and validation sets with 80 and 20%, 
respectively. For the LL2 library, we applied 
the same method, resulting in 2794 enriched 
and 2763 naive samples as our test set. We 
also normalized the sequencing counts for 
our training label such that all naive samples 
scored 0 and all positive pairs are scored ac- 
cording to their observed sequencing counts 
then min-max normalized as

$$S = \frac{\log[\mathrm{count}(X) + 100] - \log(2)}{\log(\mathrm{maxcount}) - \log(2)}$$


We added 100 base counts to all positive 
pairs to distinguish them from the naive pairs 
after normalization. 
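
The labeling rule can be written as a small function. This sketch follows the normalization formula as printed, with the constants 100 and 2 exposed as parameters; the function name, parameter names, and example counts are hypothetical.

```python
import math

def binding_label(count, max_count, is_positive, base=100, floor=2):
    """Training label: 0 for naive pairs, log-scaled count for enriched pairs."""
    if not is_positive:
        return 0.0
    numerator = math.log(count + base) - math.log(floor)
    denominator = math.log(max_count) - math.log(floor)
    return numerator / denominator

# Invented counts: the rarest enriched pair still scores well above zero.
naive_label = binding_label(5, 10000, is_positive=False)
rare_label = binding_label(1, 10000, is_positive=True)
common_label = binding_label(5000, 10000, is_positive=True)
```

Because of the added base count, even an enriched pair observed once receives a label near 0.46 when the maximum count is 10,000, which keeps it clearly separated from the naive pairs at 0.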


Protein language model embeddings 


For each oligopeptide pair, we used the full 
chain sequence with the corresponding amino 
acid in the mutant position as sequence input 
to the protein language model for the latent 
vector-representation generation (27). The vector
representation is taken as the average position-wise
embedding from the last layer of the
protein language model with 1280 dimensions.
For each pair, we generate the sequence embeddings
for each chain separately as $V_a$, $V_b$,
and the outer product is computed across the
vector representation of the two respective
chains as a 2D matrix representation for each
oligopeptide pair as $V_{ab} = V_a \otimes V_b$.
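
The pooling and outer product steps can be sketched without any deep learning library. The per-residue vectors below are tiny invented stand-ins for the 1280-dimensional language model embeddings, and both function names are hypothetical.

```python
def mean_pool(residue_embeddings):
    """Average per-residue vectors (length x dim nested lists) into one vector."""
    length = len(residue_embeddings)
    dim = len(residue_embeddings[0])
    return [sum(row[k] for row in residue_embeddings) / length for k in range(dim)]

def outer_product(v_a, v_b):
    """Matrix with entry [p][q] = v_a[p] * v_b[q]."""
    return [[x * y for y in v_b] for x in v_a]

# Toy embeddings: chain A has two residues, chain B has one, with dim = 3.
emb_a = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
emb_b = [[1.0, 0.0, 2.0]]
v_a = mean_pool(emb_a)
v_b = mean_pool(emb_b)
v_ab = outer_product(v_a, v_b)
```

With 1280-dimensional embeddings the same operation yields a 1280 x 1280 matrix per pair, in which every entry couples one feature of the first chain with one feature of the second, giving a 2D CNN an image-like input.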


Model architecture 


We designed and implemented a three-layer 
2D CNN model with kernel size (5,5) and 
channel size [64,128,256] followed by a two- 
layer fully connected (FC) network to predict 
the binding score of the input oligopeptide 
pairs. The model takes the 2D oligopeptide
pair representation $V_{ab}$ as the input and outputs
a scalar, $P_{ab}$, as the binding score. We also
apply a max pooling layer and instance norm
in between each CNN layer:

$$W^{l} = \mathrm{InstanceNorm}\big(\mathrm{ReLU}\{\mathrm{Maxpool}[\mathrm{2Dconv}(W^{l-1})]\}\big), \quad \text{where } W^{0} = V_{ab},$$

$$P_{ab} = \mathrm{FC}[\mathrm{flatten}(W^{L})],$$

where $W^{L}$ is the output of the last CNN layer.
We also applied sigmoid transformation to
the FC network output for scaling:

$$\mathrm{Sigmoid}(x) = \frac{1}{1 + e^{-x}}$$


Model training and testing 


All models are trained with squared L2 norm 
loss and the Adam optimizer with learning 
rate of le~* on a NVIDIA 2080Ti machine 
for 100 epochs with the best saved checkpoint. 
Our implementation uses PyTorch v1.11
compiled with CUDA 10.2.


X-ray crystallography 


Details of crystallization and structure determination
are provided in materials and methods
along with structure statistics in tables S1
and S2.


Surface plasmon resonance 


Dissociation constants (K_D) of affibody dimers
were acquired by SPR with the BIAcore T100 
instrument (GE Healthcare). Approximately 
100 resonance units (RU) of biotinylated af- 
fibody were captured on a SA sensor chip 
(Cytiva), including a reference channel with an 
unrelated protein. HBS-P+ (Cytiva) was used 
for all SPR runs. All measurements were made 
with twofold serial dilutions by using 60- to 
120-s association and 300- to 500-s dissociation
at a flow rate of 30 to 50 µl/min. Regeneration
was performed with 0.02% SDS
or 0.1 M glycine, pH 2.5 after each analyte
injection. The sensorgrams obtained were fit 
either to the 1:1 binding model or the steady- 
state affinity model with the BIAcore T100 
evaluation software. 


Isothermal titration calorimetry 


For isothermal titration calorimetry experi- 
ments, proteins were dialyzed overnight against 
HBS buffer. After dialysis, concentrations were 
measured with the BCA assay kit (Thermo 
Fisher Scientific). Titrations of all mutants 
were performed in a Microcal VP-ITC instrument
at 298 K with ZSPA-1 variants in the cell
at 5 µM and Z variants in the syringe at 7 to
10 times the cell concentration. The parent
ZSPA-1 protein was used in the cell at 50 µM,
with the parent Z protein in the syringe at
350 µM. Baseline subtraction was performed
by titrating Z variants or Z parent into the
dialysis buffer. All data were analyzed in Origin
7.0 and fit to a one-site model by fitting ΔH, the
association constant (K_a), and the number of
binding sites (n).
FRACTURE MECHANICS
Tensile cracks can shatter classical speed limits 


Meng Wang, Songlin Shi, Jay Fineberg* 


Brittle materials fail by means of rapid cracks. Classical fracture mechanics describes the motion
of tensile cracks that dissipate released elastic energy within a point-like zone at their tips. Within
this framework, a “classical” tensile crack cannot exceed the Rayleigh wave speed, c_R. Using brittle
neo-hookean materials, we experimentally demonstrate the existence of “supershear” tensile cracks that
exceed shear wave speeds, c_S. Supershear cracks smoothly accelerate beyond c_S, to speeds that
could approach dilatation wave speeds. Supershear dynamics are governed by different principles than
those guiding “classical” cracks; this fracture mode is excited at critical (material dependent) applied
strains. This nonclassical mode of tensile fracture represents a fundamental shift in our understanding
of the fracture process.


Fig. 1. Experimental setup and examples of subsonic and supershear cracks. (A) Cracks
propagating along unweakened (left) and weakened (middle) materials driven through applied
uniaxial stretch, λ = (h + Δh)/h (Δh is the imposed constant displacement in y). Deformation
fields are measured via distortions of 80-µm grids, imprinted on one xy surface at z = 0.
(Right) Grooves along y = 0 form straight weak layers by reducing material thicknesses
from W_0 to W. (B) Photographs of cracks propagating at v = 0.68c_R along unweakened
(left, λ = 1.10) W = W_0 and weakened (right, λ = 1.07) W/W_0 = 0.5 materials demonstrate
parabolic CTODs at the millimeter scale (dashed lines). Black regions along weak layers are
caustics (diverging light paths) caused by the highly curved boundaries of weak layers.
(C) Measured energy flux, G(v), for W/W_0 = 0.5 (red) and W/W_0 = 1 [blue; data taken from
(6, 28)]. G(v) was measured via either the CTOD (open circles) or the J-integral calculated
with measured deformation fields (filled circles). G(v) scales linearly with W/W_0.
(D) Supershear cracks in samples without (left, λ = 1.36) and with (middle, λ = 1.34) a
weak layer. (Right) Magnified views of areas denoted by dashed rectangles highlight
discontinuous deformation fronts formed by shock waves emanating from crack tips in
supershear (v > c_S) propagation.

How do materials break? How fast can a
crack propagate? These related and fundamental
questions are of essential interest
to a broad range of scientific and
engineering communities. In tensile
fracture, the maximal velocity of a moving
crack is classically considered to be (1) the
Rayleigh wave speed, c_R, although there have
been suggestions that cracks could surpass
this speed (2, 3). In the vicinity of a crack’s
tip, remotely applied stresses are amplified 
to an approximate singularity. Crack motion 
is guided by the principle of energy balance. 
This principle states that fracture takes place 
when the flux of stored potential energy flow- 
ing from large (system-size) scales to the crack’s 
tip balances the material’s fracture energy, Γ,
the energy dissipated at the tip, which is defined as
the energy per unit crack advance required to
separate the material. Provided that dissipation
is confined to this point-like zone and
crack instabilities are suppressed, fracture
mechanics provide the fundamental theoretical
framework to describe crack dynamics
(1, 4, 5). Crack velocities, v, smoothly accelerate
to c_R while maintaining energy balance (6).
Beyond c_R, fracture mechanics predict that the
energy flux into a crack’s tip will become
negative, rendering v > c_R unphysical.
An exception to this speed limit could occur
in the case of cracks driven by shear loading
(mode II) (7, 8). Analytical solutions of
mode II cracks having a finite-sized dissipative
zone (9) predict positive energy flux into
“supershear” cracks moving with speeds between
the shear wave speed, c_S, and the dilatational
wave speed, c_P. Such rapid shear
cracks are observed in interfacial and frictional
failure (10, 11), as well as in earthquakes
(12, 13). When sufficient elastic energy exists,
shear cracks approaching c_R may transition
to supershear by giving birth to a daughter
crack that moves faster than c_S (14-16), thereby
extending the dissipative region from a point
to a finite region in space. The motion of such
supershear cracks can be approximately described
through the principle of energy balance (17).

Cracks driven by tensile loading (mode I)
are commonly believed to be limited by c_R (18).
This limit, as well as the corresponding equation
of motion predicted by linear elastic fracture
mechanics (LEFM), has been confirmed
experimentally (6, 19, 20). Extensions of classical
fracture to hyperelastic materials predict
that tensile cracks could surpass c_R (21). A
qualitatively different mode of supershear tensile
fracture not incorporated in classical fracture
mechanics has been predicted by lattice models
to occur at high applied stretch (22-24). This
theory predicts cracks that are able to exceed
c_S and even c_P, regardless of material stiffening
or softening (25). Moreover, these states are
not expected to be governed by the principle of
energy balance, the cornerstone of the classical
theory of fracture. In rubber, experiments
have observed marginal supershear propagation
under extreme stretches (2, 3); however,
both unequivocal experimental evidence and
fundamental understanding of supershear tensile
fracture are still lacking.


Experimental system 


Our experiments were performed in thin, strip-like
sheets of polyacrylamide hydrogels having
varied heights, h, where x, y, and z are the
propagation, loading, and sheet thickness directions
(Fig. 1A). Polyacrylamide gels are brittle
neo-hookean materials: linearly elastic at small
stretches, λ ~ 1, and nonlinearly elastic as λ
is increased. LEFM describes the motion of
straight single cracks in these materials (6) for
crack speeds v < c_R. The elastic nonlinearity
in the near-tip region (26, 27) causes oscillatory
crack motion when v → ~0.9c_S. In Fig. 1A,
we experimentally show that crack oscillations
and microbranching instabilities (4) can be
suppressed by guiding cracks through a straight
weak layer (at y = 0) that constrains crack
motion to straight paths (materials and methods).
The deformation fields of the material
surrounding crack tips, for both straight (weak
layer) and oscillatory (no weak layer) cracks,
were measured by imprinting an initially square
grid mesh on one (z = 0) of the sample’s xy
faces. During crack propagation, we recorded
the grid’s instantaneous deformation with a
high-resolution fast camera (Fig. 1B).

Fig. 2. Typical experiments with and without a weak layer. (A) Crack speed, v, as a function of crack
length, l, in x, for cracks propagating along weak layers under different applied stretches. Sample height h =
20 mm and W/W_0 = 0.5. The values of c_R and c_S are shown by the gray dashed and solid lines, respectively.
Colored dashed lines denote v_a. (B) Photographs of supershear cracks (top) and the strain energy
density (SED) fields (in J/m^3) around the crack tip in the reference frame (bottom) at different values
of v_a corresponding to the four supershear cases shown in (A). (C) Mach cone angles, α, measured from
the SED fields in the reference frame, as a function of v_a/c_S, with c_S measured from the shear modulus. α
of the four cases shown in (A) and (B) are denoted by the colors in (A). (D) Examples of dynamics of
supershear oscillatory cracks. Orange markers show, as an example, the crack speed component in x, v_x,
for λ = 1.26, attaining an asymptotic value highlighted by the orange dashed line. (E) Three snapshots
of supershear oscillatory cracks, whose speeds are noted by the full markers in (D), at λ = 1.50.

For subsonic crack propagation at small
stretches, G(v), the instantaneous energy flux
from the effectively two-dimensional medium
into the crack tip (the “energy release rate”),
was measured by using both the grid deformation
and the crack tip’s opening displacement
(6, 28). When a weak layer is used, the
strain in the loading direction within the weak
layer is larger than within the outside region
by a factor of W_0/W, where W/W_0 is the
thickness ratio of the fractured region (fig. S1
and supplementary text section 1). The strain
energy driving fracture is, however, mainly
supplied by the outer region. As a result, in
samples with weak layers, G(v) is proportional
to W/W_0 (Fig. 1C); the effective energy
dissipation of cracks within weak layers is
Γ(v)·W/W_0, where Γ(v) is the fracture
energy of the material. No additional effects
of the weak layer on crack dynamics or structure
were observed. For low λ, straight crack
dynamics are well-described by LEFM (6),
accelerating smoothly until c_R; increasing λ
simply increases acceleration toward c_R.


Observation of supershear cracks for 
large stretches 


At higher λ (beyond ~1.2), crack dynamics
change substantially; supershear cracks appear. 
Figure 1D presents typical examples of both 
supershear oscillatory cracks and straight super- 
shear cracks that are guided by weak layers 
(movies S1 to S3). Deformation fields sur- 
rounding supershear cracks radically differ 
from sub-Rayleigh (v < c_R) cracks. Supershear
cracks have wedge-like opening displacements 
(23, 29) and shock waves emanating from 
their tips. At a shock wave, deformation fields 
discontinuously jump. Behind these shocks, 
deformation fields are highly distorted, and 
the kinetic energy density (fig. S2) increases 
precipitously relative to the strain energy den- 
sity with v. Ahead of the shock, variations of 
the deformations, as well as the kinetic and 
strain energy densities, are much smaller. 
Fig. 3. Strain field dependence of supershear cracks. (A and B) Measurement of
the ε_yy strain component of supershear cracks propagating at (A) v_a = 1.04c_S
(λ = 1.22) and (B) v_a = 1.28c_S (λ = 1.56). Dashed lines denote angular locations
of the strain measurements. r and θ represent the distance to the crack tip and the
angle relative to x, respectively. (C) Radial dependences of the strain fields for
v_a = 1.04c_S ahead of, ε^P_yy (left, inset), and behind, ε^S_yy (right, inset), the Mach cone.

Supershear crack speeds, v, versus crack
lengths, l, for various λ are presented in Fig. 2A,
for propagation along a weak layer. Cracks
initiate at subsonic speeds and accelerate
smoothly until reaching λ-dependent asymptotic
speeds, v_a (movie S1). This continuous
transition is very different from the transition
to supershear of shear cracks, in which
supershear cracks nucleate at finite distances
ahead of sub-Rayleigh cracks and the range
c_R < v < c_S is forbidden (7-9). Figure 2B shows
the grid deformations and corresponding strain
energy densities surrounding crack tips for
1.04c_S < v < 1.31c_S. In the reference (material)
frame, the shock waves generated form Mach cones
with angles, α, relative to x, where sin(α) = c_S/v_a
(Fig. 2C). Shear wave speeds derived from the
Mach cones, c_S = 5.75 ± 0.22 m/s, agree well
with both values calculated from the shear
modulus, μ, and directly measured values
(materials and methods). The speed of waves
propagating in a stretched medium was determined
to be accurately described by the neo-hookean
constitutive law of the material in our experiments,
with a constant value of c_S (fig. S3 and
supplementary text section 2). Without weak
layers, supershear oscillatory cracks appear
at the same λ levels (Fig. 2D) as the straight
supershear examples presented in Fig. 2A.
Here, whereas the global speed, v, oscillates
with increasing amplitudes, the velocity component
in x, v_x, reaches a constant (asymptotic)
speed, v_a = v_x, before the oscillation develops
(fig. S4 and supplementary text section 3).
Oscillatory supershear cracks generate shock
waves that, in contrast to Mach cones of straight
cracks, have irregular shapes that vary in time
(Fig. 2E) as they reflect the spatially oscillating
crack tips.
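
The Mach cone relation quoted in this paragraph, sin(α) = c_S/v_a, can be illustrated with a short numerical sketch. The function names are illustrative and do not come from the article; the value c_S = 5.75 m/s and the speed ratios 1.04, 1.28, and 1.31 are the ones quoted in the text.

```python
import math

C_S = 5.75  # m/s, shear wave speed quoted in the text


def mach_angle_deg(v_a: float, c_s: float = C_S) -> float:
    """Mach cone angle alpha (degrees) from sin(alpha) = c_s / v_a."""
    if v_a <= c_s:
        raise ValueError("a Mach cone requires v_a > c_s")
    return math.degrees(math.asin(c_s / v_a))


def shear_speed_from_cone(v_a: float, alpha_deg: float) -> float:
    """Invert the same relation: c_s = v_a * sin(alpha)."""
    return v_a * math.sin(math.radians(alpha_deg))


if __name__ == "__main__":
    # Speed ratios v_a / c_S taken from the text; names are hypothetical.
    for ratio in (1.04, 1.28, 1.31):
        v_a = ratio * C_S
        print(f"v_a/c_S = {ratio:.2f} -> alpha = {mach_angle_deg(v_a):.1f} deg")
```

The inverse function mirrors how a shear wave speed could be read off a measured cone angle and a measured crack speed; it is a sketch of the geometry only, not of the article's image analysis.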

For supershear cracks, ε_yy is the strain
component with the largest variations near
the crack tip. In Fig. 3, A and B, we present
measurements of ε_yy for v_a = 1.04c_S and 1.28c_S.
We first consider the (P-wave governed) fields
ahead of the Mach cones, which we denote as
ε^P_yy. For both velocities, Fig. 3, C and D (left),
show that ε^P_yy(r, θ) have very different radial
dependences for each θ. When, however, normalized
by ε^P_yy(r = 5 mm, θ), all ε^P_yy(r, θ) collapse
onto well-defined radial (v-dependent)
functions, f_r(r, v). This collapse suggests that
ε^P_yy are separable functions of the form
ε^P_yy(r, θ) = f_r(r, v) · f_θ(θ, v). This separable
nature is also true for ε_xx and ε_xy. Although
neither of the functions f_r(r, v) possesses a pure
power-law form, near the transition, f_r(r, v = 1.04c_S)
increases sharply as r approaches the shock
front, whereas at the larger velocity, f_r(r, v =
1.28c_S) is significantly “less” singular.

We now consider the (shear wave-dominated)
strain fields ε^S_yy behind the Mach cone (Fig. 3, C
and D, right panels). The same normalization for
v_a = 1.04c_S produces an approximate collapse to
a ~1/r form for large r, but does not collapse
the strain fields for r < 2 mm. At the higher
velocity, ε^S_yy(r, θ) do approximately collapse to
the form ε^S_yy(r, θ) = g_r(r, v) · g_θ(θ, v) (Fig. 3D,
right), with a nearly linear nonsingular decay
of g_r that is much weaker than the decay of
ε^S_yy for v_a = 1.04c_S. This qualitative behavior is
echoed by the ε_xx and ε_xy components.

Measurements were performed at the values of θ denoted by the colors in (A).
Main panels: ε^P_yy (left) and ε^S_yy (right) when normalized by the far-field values
ε_yy(r = 5 mm, θ) for each θ. The black dashed line (right) depicts a 1/r dependence.
(D) Radial dependences of ε^P_yy (left, inset) and ε^S_yy (right, inset) at v_a = 1.28c_S.
Main panels demonstrate the collapse of these functions, when normalized as in
(C). Colors are the same as for the dashed lines in (B).


The transition to supershear cracks and their
dynamics are governed by λ


The energy released by supershear cracks, G_a,
is measured at v_a in the strip geometry. At
v_x = v_a, translational invariance in x exists,
so for both straight and oscillatory supershear
cracks, the strain energy density, w_e(λ) (w_e is
defined in supplementary text section 5), is
constant far ahead of the crack tip. Hence, the
energy released per unit crack advance is
G_a = w_e(λ) · h. For v < c_R, G_a = G(v) = Γ(v)
(1). For v < c_R, the values of G(v) measured
via the J-integral, crack tip opening displacement
(CTOD), and strip methods are identical
(supplementary text section 5).
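
As a rough illustration of the strip relation G_a = w_e(λ) · h, the sketch below uses the leading-order form w_e ≈ (3μ/2)(λ − 1)^2 that the article quotes in its Discussion. The shear modulus is estimated from c_S = sqrt(μ/ρ) with an assumed water-like density; that density, the function names, and the neglect of any weak-layer correction are assumptions made here, not values from the article.

```python
RHO = 1000.0         # kg/m^3, ASSUMED water-like density; not given in the text
C_S = 5.75           # m/s, shear wave speed quoted in the text
MU = RHO * C_S ** 2  # Pa, shear modulus estimated from c_S = sqrt(mu / rho)


def strain_energy_density(stretch: float, mu: float = MU) -> float:
    """Leading-order w_e = (3*mu/2) * (stretch - 1)^2, in J/m^3."""
    return 1.5 * mu * (stretch - 1.0) ** 2


def strip_energy_release(stretch: float, height: float, mu: float = MU) -> float:
    """G_a = w_e(stretch) * h, in J/m^2, with the strip height h in metres."""
    return strain_energy_density(stretch, mu) * height


if __name__ == "__main__":
    # Two hypothetical (stretch, height) pairs chosen so that G_a coincides
    # at leading order even though the applied stretch differs.
    g_short = strip_energy_release(1.4, 0.020)
    g_tall = strip_energy_release(1.2, 0.080)
    print(f"h = 20 mm, stretch 1.4: G_a = {g_short:.1f} J/m^2")
    print(f"h = 80 mm, stretch 1.2: G_a = {g_tall:.1f} J/m^2")
```

Because G_a depends on the product of h and (λ − 1)^2 at this order, the same G_a can be reached at different stretches by changing the strip height, which is why varying h separates the roles of G_a and λ.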

In Fig. 4A, we triggered supershear cracks
for various values of G_a by varying h and λ.
Although v_a varies with G_a for a given h, v_a is
not a universal function of G_a; variation of h
produces different v_a for the same values of
G_a. Thus, in contrast to sub-Rayleigh fracture,
Fig. 4A demonstrates that the “classical” energy
balance [G_a = Γ(v)] is not the governing
principle for supershear dynamics.

What drives cracks to be supershear? Figure
4B indicates that the supershear transition
takes place at a critical value of λ = λ_t =
1.19 ± 0.01, for the gel composition used. In
particular, the dependence of v_a on λ remains
invariant even when Γ(v) is changed by varying
W/W_0. Even without a weak layer, the
same critical value of λ_t = 1.19 ± 0.01 governs
the transition from subshear to supershear
oscillatory cracks. Moreover, Fig. 4B demonstrates
that v_a depends solely on λ; for a given
gel, all supershear velocities, both for straight
and oscillatory cracks, collapse to a single,
well-defined function of λ. No effects of the
geometry and weak layer height on the dynamics
of supershear cracks were observed (fig. S5
and supplementary text section 4). Supershear
dynamics are governed by the stretch level,
λ, not by G_a.

[Fig. 4 plot residue: panel A, horizontal axis G_a (J/m^2), legend h = 20, 40, and 80 mm; panel B, horizontal axis λ, legend oscillatory crack (W/W_0 = 1), W/W_0 = 0.34, W/W_0 = 0.58.]
Fig. 4. Dynamics of supershear cracks. (A) Asymptotic crack speeds, v_a, as a function of G_a for three different
system heights, h, of 20, 40, and 80 mm with W/W_0 = 0.5 while varying λ. Each data point corresponds
to a single experiment. c_R and c_S are denoted by the gray dashed and solid lines, respectively. When h is varied, each
v_a corresponds to different values of G_a. This breakdown of the collapse of crack dynamics with G (that is an
inherent property of LEFM) already takes place within the transition region (c_R < v < c_S) to supershear.
(B) Asymptotic crack speeds, v_a, as a function of λ. Symbols with the colors in (A) represent the same experiments.
Purple and black symbols correspond to straight supershear cracks with different values of W/W_0 for h = 20 mm.
All v_a collapse to a unique function of λ (including v_a within the transition region), regardless of the value
of W/W_0. Supershear oscillatory cracks (red) collapse onto the same curve. Representative error bars (SD) for
oscillatory cracks are shown. Orange dashed line denotes the value of λ_t = 1.19 ± 0.01 at the supershear transition.

Fig. 5. Dependence of supershear cracks on
the chemical composition of gels. (A) Asymptotic
crack speeds, v_a, as a function of λ for gels
with different concentrations. Gray dashed and
solid lines correspond to c_R and c_S, respectively. Orange
symbols are the data shown in Fig. 4B. Supershear
cracks appear at increased values of λ_t when
either the crosslinker concentration is decreased
(green and purple) or the amount of monomer is
increased (blue). When crosslinker and monomer
concentrations are proportionally varied, v_a versus λ
is invariant (red, orange). (B) Strain at the supershear
transition, λ_t − 1, for gels of different monomer-to-
crosslinker molar ratios, M. (λ_t values were determined by a spline interpolation at v_a/c_S = 1.) Symbol colors
correspond to the respective gels in (A). A representative error bar (SD) is presented. The black dashed line, with
a slope of 0.5, is a guide to the eye. (Inset) A linear fit of λ_t − 1 with √M (red dashed line) produces a slope of
0.021 with no intercept. (C) Collapse of v_a versus λ curves for gels of different chemical compositions when λ is
normalized by λ_t. A transition in the v_a versus λ relation appears at v_a ~ c_R.

G_a must still be greater than Γ(c_S) for
supershear fracture to take place. Even if λ > λ_t,
subshear propagation will occur when h is so
small that Γ(0) < G_a < Γ(c_S) (supplementary
text section 5 and figs. S6 and S7). As G_a >
Γ(c_S) is necessary, a critical strip height, h_c =
Γ(c_S)/w_e(λ_t), is required to support supershear.
For the gels used in Fig. 4, h_c ≈ 12.8 mm
for W = W_0. h_c decreases proportionally with
W/W_0.

What controls the value of λ_t? To address
this question, we performed experiments
on samples composed of varying monomer
and crosslinker concentrations (Fig. 5). M, the
monomer-to-crosslinker molar ratio, determines
the number of polymer segments between
crosslinks (supplementary text section
6). Figure 5A shows that both λ_t and the
overall dependence of v_a on λ are unaffected
when monomer and crosslinker concentrations
are increased proportionally (thus fixing
M while appreciably varying both the
elastic moduli and Γ; table S1). When, however,
M is changed by separately varying the
monomer or crosslinker concentrations, the
relation between v_a and λ systematically varies.
In particular, Fig. 5B shows that the strain
at the transition, λ_t − 1, scales approximately
linearly with √M with a proportionality factor,
α ~ 0.021. Figure 5C demonstrates that normalizing
λ by λ_t collapses all of the v_a versus λ
relations for different gels to a single curve
with a sharp transition from subsonic cracks
to the supershear branch at λ/λ_t = 1. The
slight divergence for gels with high monomer
concentrations may be due to extensive
polymer entanglement, which also produces
high Γ (30).
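
The empirical scaling reported for Fig. 5B, λ_t − 1 ≈ α√M with α ≈ 0.021, and the normalization λ/λ_t used for Fig. 5C can be written as a minimal sketch. The function names are hypothetical, and the relation is only the approximate fit quoted in the text, not a law that holds for arbitrary M.

```python
import math

ALPHA = 0.021  # proportionality factor quoted for the Fig. 5B fit


def transition_stretch(m_ratio: float, alpha: float = ALPHA) -> float:
    """Estimate lambda_t = 1 + alpha * sqrt(M) for a molar ratio M."""
    if m_ratio < 0:
        raise ValueError("the molar ratio M must be non-negative")
    return 1.0 + alpha * math.sqrt(m_ratio)


def molar_ratio_from_stretch(lambda_t: float, alpha: float = ALPHA) -> float:
    """Invert the fit: M = ((lambda_t - 1) / alpha)^2."""
    return ((lambda_t - 1.0) / alpha) ** 2


def normalized_stretch(stretch: float, m_ratio: float, alpha: float = ALPHA) -> float:
    """lambda / lambda_t, the collapse variable used for Fig. 5C."""
    return stretch / transition_stretch(m_ratio, alpha)


if __name__ == "__main__":
    # M = 100 is an arbitrary illustrative value, not a gel from the article.
    print(f"lambda_t for M = 100: {transition_stretch(100.0):.3f}")
    print(f"M implied by lambda_t = 1.19: {molar_ratio_from_stretch(1.19):.1f}")
```

Values of normalized_stretch above 1 would correspond to the supershear branch in the collapsed curve, under the assumption that the fit applies to the gel in question.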


Discussion 


Figure 5B demonstrates that λ_t, the macroscopic
critical stretch at the supershear transition,
depends critically on a microscopic
quantity, M, that characterizes the gels’ internal
structure. Fracture mechanics enables
us to relate the applied stretch to a microscopic
scale, the cohesive zone size, δ_c. We
will now show that this connection provides
us with a way to understand the λ_t − 1 ∝ √M
scaling presented in Fig. 5B.

We consider the simplest cohesive zone
model, the Dugdale model (7)

\delta_c = \frac{\Gamma(c_S)}{\sigma_c} = \frac{w_e(\lambda_t)\, h_c}{\sigma_c}    (1a)

where σ_c is the maximal stress that the material
can sustain. Replacing w_e(λ_t) in Eq. 1a by
its leading order term, (3μ/2)(λ_t − 1)^2, and
substituting the scaling result suggested by Fig. 5B,
λ_t − 1 ≈ α√M, yields

\delta_c \approx \frac{3\mu\alpha^2 h_c M}{2\sigma_c}    (1b)

Equation 1b predicts that δ_c is proportional
to d = Mb/2, the unfolded (stretched) polymer
length between two crosslinks. The proportionality
factor is 3μα^2 h_c/(σ_c b), where b is the
length of a monomer unit. The fact that δ_c and
d scale in the same way with M and 3μα^2 h_c/
(σ_c b) ~ O(1) (table S1 and supplementary material
section 6) suggests that these two ostensibly
independent quantities are related.
Hence, the scaling of λ_t − 1 with √M suggests
a transition of the polymer structure
within the cohesive zone, from coiled polymer
chains (whose lengths scale as √M) to stretched
chains whose strength, σ_c, is determined by
the internal forces within the polymer chains
(31). This transition may explain the heuristic
scaling presented in Fig. 5B and could
provide a clue toward a physical picture of the
transition from classical fracture to supershear.
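
The step from Eq. 1a to Eq. 1b can be written out explicitly. The following is a sketch that only uses the substitutions stated in the text (the leading-order strain energy density and the empirical scaling of Fig. 5B); the symbol names follow the definitions given above, with δ_c the cohesive zone size, σ_c the maximal stress, h_c the critical strip height, and b the monomer length.

```latex
\begin{align*}
\delta_c &= \frac{\Gamma(c_S)}{\sigma_c}
          = \frac{w_e(\lambda_t)\,h_c}{\sigma_c}
  && \text{(Eq. 1a)} \\
w_e(\lambda_t) &\approx \frac{3\mu}{2}\,(\lambda_t - 1)^2
          \approx \frac{3\mu}{2}\,\alpha^2 M
  && \text{(using } \lambda_t - 1 \approx \alpha\sqrt{M}\text{)} \\
\delta_c &\approx \frac{3\mu\,\alpha^2 h_c\,M}{2\,\sigma_c}
          = \frac{3\mu\,\alpha^2 h_c}{\sigma_c\,b}\cdot\frac{M b}{2}
          = \frac{3\mu\,\alpha^2 h_c}{\sigma_c\,b}\,d
  && \text{(Eq. 1b, with } d = Mb/2\text{)}
\end{align*}
```

Written this way, the prefactor 3μα^2 h_c/(σ_c b) is dimensionless, which is consistent with the statement that it is of order 1.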
Fracture mechanical solutions for supershear
tensile cracks do exist in the literature (1),
although they are considered to be nonphysical
on energy grounds. These solutions exhibit
qualitative behavior that is very different from
what we experimentally observe here. Both
ahead of and behind the Mach cone, the
theoretical solutions are singular (~1/r^g), with the
same singularity exponent, g(v). Moreover, these
solutions predict g ~ 0 at the transition
to supershear, with an increase to a maximal
value of g = 1/2 at v = √2 c_S. By contrast, our
experiments clearly show that the observed
supershear cracks have very different radial
dependences on both sides of their Mach
cones. In addition, although the measured
fields are not truly singular, they have their
strongest r dependence when propagating at
speeds close to c_S. Beyond c_S, the fields’ radial
dependence then weakens with increasing v.
There also exist numerical solutions for
supershear cracks in hyperelastic materials in
which the local wave speed within the highly
stretched region at the crack tip is higher than
the wave speeds in the surrounding bulk material
(21). In such cases, cracks are “locally” subsonic,
whereas crack velocities appear to be
beyond c_S to material far from the crack tip.
Such effects have been recently observed in
shear fracture (32). The supershear branch
described in our experiments does not appear
to correspond to these types of cracks
because, at the strains applied in our experiments,
the gels used are well-described by
neo-hookean constitutive relations, even within
the weak layers up to λ = 1.8 (fig. S1). In
neo-hookean materials, it is known that c_S is
invariant with stretch (23, 28), and we have
verified this with direct measurements
(supplementary text section 2 and fig. S3).

Our results therefore suggest that, as Marder 
(23) predicted, an entirely different branch of 
solutions for propagating cracks exists. More- 
over, these results suggest that the supershear 
solution branch is both entirely general and 
independent of the microscopic material struc- 
ture; the elastic gels in the experiments and the 
brittle lattices in the numerics possess wholly 
different microscopic properties. Even if the
experimentally observed supershear states correspond
to the branch of solutions observed in
these lattice models (22-25), many important
questions arise that challenge
our fundamental understanding of fracture.

For example, our results imply that the tran- 
sition mechanism to supershear takes place 
within the small scales surrounding the crack 
tip. The polymer-stretching transition sug- 
gested here is, however, certainly not realized 
in lattice models. As such, it is unclear what 
general mechanisms within the dissipative re- 
gion surrounding the crack tip give rise to the 
supershear transition. 

There are also puzzles at large (system-size)
scales. In contrast to supersonic cracks driven
by extreme (explosive) loading rates (33), the
supershear cracks described here are driven
by the release of elastic energy that is stored
within the macroscopic scales that are roughly
defined by h_c. h_c is both insensitive to local
stretch along the crack path and much larger
than both the crack tip region
and the weak layer scale. An important
question is, therefore, by what mechanism does 
the material supporting these rapidly propagat- 
ing states self-organize, to transport the strain 
energy within h<h, to the dissipative region at 
supershear crack tips? This mechanism must 
be wholly different from that described by classical
fracture mechanics; a “classical” crack is
unable to transport energy to its tip for v = c_R.

Climbing fiber multi-innervation of mouse Purkinje
dendrites with arborization common to human


Silas E. Busch and Christian Hansel*


Canonically, each Purkinje cell (PC) in the adult cerebellum receives only one climbing fiber (CF) from 
the inferior olive. Underlying current theories of cerebellar function is the notion that this highly 
conserved one-to-one relationship renders Purkinje dendrites into a single computational compartment. 
However, we discovered that multiple primary dendrites are a near-universal morphological feature in 
humans. Using tract tracing, immunolabeling, and in vitro electrophysiology, we found that in mice ~25% 
of mature multibranched cells receive more than one CF input. Two-photon calcium imaging in vivo 
revealed that separate dendrites can exhibit distinct response properties to sensory stimulation, 
indicating that some multibranched cells integrate functionally independent CF-receptive fields. These 
findings indicate that PCs are morphologically and functionally more diverse than previously thought. 


Inputs to the cerebellar cortex are integrated
by the dendrites of Purkinje cells (PCs), its
sole cortical output neuron. Despite their
well-characterized position in what is considered
a conserved and stereotypical circuit
(1), PCs exhibit diverse dendritic morphology
in rodents (2), and it is not known how specific
features of dendritic arborization may affect
their function.

Human PC morphology remains even more 
elusive. Studies of human PC morphology, 
which date back more than 120 years to the 
iconic illustrations of Golgi and Ramón y Cajal
(3, 4), typically investigate small numbers of 
cells (5-7). Although no quantitative infor- 
mation on frequency and distribution of mor- 
phological types is available, it can be observed 
that human PCs are often “multibranched,” 
having either numerous trunks emerging from 
the soma or a proximal bifurcation of a single 
trunk. These features produce highly segregated 
dendritic compartments, raising the question 
of whether this confers functional properties 
that have gone unreported. 

We specifically asked whether the existence 
of several primary dendrites enables multiple 
climbing fiber (CF) innervation in the adult 
cerebellum. During development, the early 
growth of a primary dendrite provides struc- 
tural support for the ramification of a “winner” 
CF amidst competitive elimination of surplus 
CFs (8-10). Weaker CF inputs fail to translocate
to the dendrite, possibly as a result of competitive
processes resembling adult bidirectional
synaptic plasticity (11-14). In PCs where multiple
primary dendrites conceivably offer a
means to evade competition from other CFs, is
the elimination pressure reduced enough to
allow multiple CFs to be maintained? Would
multi-innervation provide functionally independent
receptive fields to distinct dendritic
compartments?


A majority of human, but not murine, PCs 
have multiple primary dendrites 


We used fluorescent calbindin immunolabeling
to visualize PCs in postmortem human
tissue (Fig. 1A). Based on proximal primary
dendrite structure, which articulates the contours
of the entire arbor, we define one standard
structural category and two multibranched
categories. In Normative PCs, one primary
dendrite may have a distant bifurcation
(beyond a two-somatic-diameter threshold
of 40 μm in mice and 50 to 70 μm in humans).
In Split PCs, one trunk bifurcates into
multiple primary dendrites proximal to the soma
(below the somatic diameter threshold). In
Poly PCs, multiple trunks emerge directly
from the soma (Fig. 1A and fig. S1; see materials
and methods). Although these categories
translate to mice (Fig. 1D), we found that mice
diverged significantly from humans in that they
had fewer Split PCs (35.9 versus 44.8%) and far
fewer Poly PCs (16.6 versus 51.2%; Fig. 1G and
fig. S2A). Instead, in mice Normative PCs
constituted the largest PC category (47.5%) in
contrast to humans (4.0%).

We manually marked the distribution of 
dendritic morphologies of collectively ~8000 
cells across whole parasagittal reconstructions 
of brain slices from the mid-hemisphere in hu- 
mans and mice (Fig. 1, B and E, and fig. S2, B
and C). In posterior lobules of humans, there is 
a higher percentage of Poly PCs (53.8 versus 
40.9%) and a lower percentage of Normative 
PCs (3.5 versus 6.1%) than in anterior lobules 
(Fig. 1C, fig. S2A, and table S1). Although the 
total rate is far lower, Poly PCs are relatively 
more prevalent in posterior lobules of mice as 
well (21.2 versus 10.3%; Fig. 1, F and G, fig. S2A, 
and table S2). 

The broad morphological distributions were
consistent across nonpathological human and
mouse individuals (fig. S2, B and C, and tables
S1 and S2) and did not depend on zonal patterning
by zebrin II expression (15) (fig. S3). Physical
constraints, however, might play a role in the
spatial distribution of PC dendrite morphologies
by foliar subregion (7). Indeed, the multiple
dendrites of Split and Poly PCs predominantly
ramified in a horizontal orientation (defined
by <30° angle deviation from the PC layer, fig.
S4, A and B; see materials and methods) in the
sulcus of human folia (80%), whereas this occurred
much less frequently in the bank and gyrus (23
and 25%, respectively; fig. S4D). This effect was
not present in mice (fig. S4, C and D). In human
PCs—where dendritic size expands strongly 
(Fig. 1, A and D)—the horizontal orientation 
follows the inward curvature of the sulcus, pos- 
sibly indicating a developmental response to 
physical constraints on growth. Because the 
physiological implications cannot be readily 
studied in humans, we turn to the corresponding 
mouse cells for further characterization. 


Multiple CFs may innervate separate 
primary dendrites 


CF activity causes complex spike firing in PCs
(16, 17), which is reciprocally related to simple
spike firing (17, 18) and exerts powerful control
over dendritic integration and parallel fiber (PF)
plasticity (19-26). Though many studies cite the
critical importance of one-to-one CF to PC
connectivity in cerebellar function, as well as abnormal
connectivity in dysfunction, some work has
shown CF multi-innervation in ~15% of PCs in
adult rodents (27-29).

To test whether multiple CF innervation can
be found in mature PCs, we combined a sparse
dextran tracer (DA-594) labeling of inferior
olivary (IO) neurons (Fig. 2A) with immunolabeling
of CF terminal boutons (VGluT2) and
PCs (calbindin). Because all CF terminals are
marked by VGluT2, but only some will express
DA-594, this method allows for the identification
of multiple CF inputs from distinct IO neurons
onto single PCs (9, 30). Figure 2B shows a
Poly PC (P87) that was indeed innervated by
two CFs on its separate primary dendrites.


Quantification of CF multi-innervation in 
mature PCs 


We obtained a quantitative measure of CF 
multi-innervation across the PC population by 
using whole-cell patch-clamp recordings in 
murine cerebellar slices (Fig. 2C). We adjusted 
current intensity and stimulus electrode posi- 
tion in the granule cell layer—subadjacent to 
each patched PC—to identify any ascending CF 
inputs and their stimulus thresholds. Mono- 
innervated PCs had a single, discrete excitatory 
post-synaptic current (EPSC) amplitude whereas 
multi-innervated PCs exhibited two or more dis- 
crete EPSC amplitudes selectively evoked by 
distinct stimulus intensities (Fig. 2C, bottom, 
and fig. S5, A and B). About 15% of all PCs in
mature animals (P20-66) received multiple CFs
(Fig. 2D, left). CF competition for survival is
complete by P20 (8, 9, 28, 31). In keeping with
this, we did not find an effect of age on the rate
of multi-innervation (fig. S6L).

Combining this technique with fluorescent
dye loading and confocal imaging revealed that
multi-innervation was largely restricted to PCs
with multi-branched structures (23/24 PCs) and
occurred in ~25% of cells in this group (1/64
Normative, 15/61 Split, and 8/34 Poly PCs; Fig.
2D). The summed CF EPSC of multi-innervated
PCs was larger, on average, than the amplitude
of individual CF inputs to mono-innervated PCs
(Fig. 2E). The amplitude of the smaller CF (at
-30 to -10 mV holding potential) was typically
>200 pA (Fig. 2E). This indicates that, under
physiological membrane potentials, even the
weakest of multiple CFs will likely deliver
sufficient current to the soma to influence
output (32). The amplitude of weaker CFs
increased with age (fig. S5M), which may denote
a delayed or elongated maturation period of
these inputs relative to the completed
development of single CF inputs or the more dominant
of multiple CFs (fig. S5N). The relative EPSC
amplitude ratio between dominant and smaller CFs
varied widely, but smaller CFs most often had
>25% the relative amplitude of the dominant
CF (fig. S5O). This ratio differed across foliar
sub-areas (fig. S5P) and correlated with the angle
between Poly PC trunks (fig. S5Q), further
emphasizing the relationship between morphology
and CF input properties.

Fig. 1. Comparative morphology and regional variability in human and
mouse cerebellar PCs. (A) Immunolabeling of PCs in humans reveals a range of
dendritic morphologies, categorized by primary dendrite geometry as Normative,
Split, or Poly. (B) Human mid-hemisphere reconstruction demonstrating the
spatial distributions of each morphological type. As a result of variable
preservation of tissue, some anterior lobules and intervening posterior
sub-lobules had a lower density of labeled PCs. (C) Morphology demographics
across lobules (n = 3 individuals >86 years old, 6640 cells; see table S1).
(D) PCs filled with dye during a patch experiment in mice to scale with human
cells also exhibit Normative, Split, and Poly morphology. (E and F) As in (B and
C), but in mice (n = 3 mice >P50, 1350 cells; see table S2). (G) Morphological
category distribution counted by lobule (top) in human (n = 20, 21, and
21 lobules) and mouse (n = 30, 30, 29) reveals a consistent increase in the
number of Split and Poly PCs in human matching the rates of the whole cell
population (bottom). Average lines depict median lobule value. *P < 0.05,
**P < 0.01, ***P < 0.001.

Fig. 2. CF multi-innervation of mature multibranched PCs. (A) Schematic of tracer (DA-594) injection.
(B) A Poly PC after immunolabeling for PCs (calbindin) and CF terminals (VGluT2). The tracer label
distinguishes CFs with distinct olivary origin on the left and right trunks. (C) Scheme of whole-cell
patch-clamp in cerebellar slices and CF EPSCs recorded from either a mono- or multi-innervated PC. (D) Number
of mono- versus multi-innervated PCs as a combined population (left). Categorizing by morphology reveals
that effectively all multi-innervation occurs in multibranched PCs (n = 50 animals, 159 cells). (E) Summed
multi-CF EPSCs are larger than mono-CF EPSCs (n = 135 and 24 cells). The weaker of multiple CFs typically
provides >200 pA signals. Holding potential: -10 to -30 mV (n = 24 cells). (F) Multi-innervated PCs have
earlier dendrite bifurcations (n = 135 and 24 cells). (G) Among PCs with a bifurcated primary dendrite,
multi-innervated cells have a wider distance between compartments (n = 85 and 9 cells). (H) Multi-innervated
Poly PCs have a wider angle between emerging trunks (n = 26 and 8 cells). Summary points indicate
mean ± SEM. *P < 0.05, **P < 0.01, ***P < 0.001.
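
The innervation rates quoted in this section follow directly from the reported cell counts (1/64 Normative, 15/61 Split, and 8/34 Poly PCs). As a quick arithmetic check, here is a minimal Python sketch; the dictionary layout and variable names are illustrative and are not taken from the original analysis.

```python
# (multi-innervated, total recorded) PCs per morphological category,
# as reported in the text and Fig. 2D.
counts = {
    "Normative": (1, 64),
    "Split": (15, 61),
    "Poly": (8, 34),
}

multi_total = sum(multi for multi, _ in counts.values())
cells_total = sum(total for _, total in counts.values())

# Multibranched group = Split + Poly
multi_branched = counts["Split"][0] + counts["Poly"][0]
cells_branched = counts["Split"][1] + counts["Poly"][1]

rate_all = multi_total / cells_total                # fraction of all PCs
rate_branched = multi_branched / cells_branched     # fraction of Split + Poly PCs
share_in_branched = multi_branched / multi_total    # share of multi-innervation

print(f"All PCs: {multi_total}/{cells_total} = {rate_all:.1%}")
print(f"Split + Poly: {multi_branched}/{cells_branched} = {rate_branched:.1%}")
print(f"Multi-innervated PCs that are multibranched: {share_in_branched:.1%}")
```

The totals reproduce the 24 multi-innervated cells among 159 recorded cells given in the Fig. 2 caption, and the two rates correspond to the "about 15%" and "~25%" figures in the text.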

The prevalence of multi-innervation was 
correlated with proximity of bifurcation and 
angle of separation between emerging trunks 
in Split and Poly PCs, respectively (Fig. 2, F 
and H, and fig. S5, D and F). Multi-CF PCs also 
had wider dendritic arbors in the parasagittal 
plane (Fig. 2G and fig. S5G), but did not differ 
in the angle of bifurcation (fig. S2E) or soma 
size (fig. S5H). 

CF multi-innervation was present across cere- 
bellar regions and foliar sub-areas (fig. S6, A 
to I). Posterior lobules had a higher frequency 
of multi-innervation (fig. S6J), possibly due to
increased prevalence of Poly PCs (fig. S6, H and 
J), matching our finding in immunolabeled tis- 
sue (Fig. 1F and fig. S2A). We did not observe 
a preferential rate of multi-innervation within 
the sulcus as a general pattern [but see (29) for 
more detailed analysis within vermis]. 


CF multi-innervation produces heterogeneous
Ca²⁺ signals across dendrites in vivo


Do multiple converging CFs provide function- 
ally distinct inputs to a single PC? How would 
this affect dendritic signaling in vivo? To answer 
these questions, we examined whether CF multi-
innervation produces heterogeneous Ca²⁺ sig-
nals across separate dendritic branches. CF
input triggers massive Ca²⁺ entry into PC den-
drites through voltage-gated Ca²⁺ channels (33),
NMDA receptors (34), and release from internal
stores (35), which can be locally modulated by
ion conductance plasticity (33, 36) and inter-
neuron inhibition (37, 38). These mechanisms
contribute to the calcium events that we moni-
tor here in vivo and to their modulation (39-41).

We obtained a sparse PC expression of the
Ca²⁺ indicator GCaMP6f and used two-photon
imaging of mice in a state of quiet wakefulness
(awake; no detected motion) to record non-
evoked “spontaneous” Ca²⁺ signals from primary
dendrite compartments in small populations
of <10 cells (Fig. 3, A and B; see materials and
methods). Volumetric scans visualized cellular
morphology and permitted manual tracing of
compartment ROIs (Fig. 3, B and D, and Figs.
4A and 5A) so that fluorescence signals could be
extracted and deconvolved separately to contrast
event amplitude and frequency across branches
(Fig. 3, C and E). In this configuration, non-evoked
Ca²⁺ signals beyond the micro-compartment
scale are almost entirely CF-dependent (42, 43)
and Ca²⁺ event amplitude reflects the number
of spikes in the presynaptic CF burst (44). This is
confirmed by our observed ~1.2 Hz spontaneous
Ca²⁺ event frequency (fig. S7D) that matches an
expected CF input frequency moderately greater
than 1 Hz (17).

We first identified local Ca²⁺ peaks detected
in only one branch of each cell (Fig. 3, C, F, and G
and movie S1), which were moderately smaller
than globally expressed events on average (Fig.
3F). We also compared the inter-event cross-
correlation of Ca²⁺ events across branches, for
which the fit and significance of a linear
regression describes the interbranch covariation
(Fig. 3, H and I).

Most PCs had homogeneous Ca²⁺ signals
with linear inter-event covariance relationships
across branches (Adj R² > 0.1) and low numbers
of local events (Fig. 3, G and H). However, some
PCs exhibited Ca²⁺ signal heterogeneity
characterized by a linear regression of inter-event
covariation with low Adj R² < 0.1 that was not
significant (0, 15, and 38% of Normative, Split,
and Poly PCs, respectively; Fig. 3, H and I) or a
higher ratio of local events (17.4, 36.6, and 51%;
Fig. 3G). High variability of inter-event amplitude
scale between branches, another measure
of heterogeneity, correlated with the bifurcation
distance and total parasagittal dendritic width
(Fig. 3, J and K, and fig. S7C). This further links
heterogeneity to underlying morphological
contours defined by primary dendrite geometry.
Confirming that local events are the product of
additional CF input, PCs with high local event
rates had higher mean (fig. S7, G, H, and P)
and maximum total event rates (fig. S7, L and
M), producing a larger dynamic range (fig. S7, N
and O). Our observations link the occurrence of
local CF events to underlying morphological
contours defined by primary dendrite geometry,
although other factors, such as inhibition by
molecular layer interneurons (MLIs), are likely
to contribute as well (37, 42).

Fig. 3. Two-photon imaging in vivo reveals Ca²⁺ signal heterogeneity
across PC dendrites. (A) Schematic of experimental preparation.
(B) Example imaging plane and three-dimensional reconstruction of a Poly PC.
(C) Spontaneous signal and deconvolved events (circles) by branch with
difference trace below demonstrates heterogeneous global event amplitude scale
and branch-specific events. (D and E) Another recording from a Normative and
Split PC highlights homogeneous versus heterogeneous signaling. (F) Local
events are moderately smaller than global events (n = 15 animals, 95 cells).
(G) Branch-specific local events as a percentage of total events in each cell by
morphology (n = 15 animals; n = 32, 55, and 8 cells). (H) Linear regressions
on branch cross-correlation quantify branch similarity (left). Model fit R values
(right) reveal that cells with low branch signal similarity are predominantly
Split and Poly PCs (n = 32, 55, and 8 cells). Bordered points indicate […]
(I) Cells lacking detectable relationship using
amplitudes are all Split or Poly PCs. (J and K)
[…]ation by Split distance (n = 105 cells) and total
[…] dendrites (n = 109 cells). Summary points
indicate mean ± SEM. *P < 0.05, **P < 0.01, ***P < 0.001.
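
The adjusted R² criterion used above to separate homogeneous from heterogeneous cells can be illustrated with a short Python sketch. It assumes that each cell is summarized by paired event amplitudes from two branches and that a simple ordinary least-squares line is fit between them. The function names and sample values are hypothetical, and the significance test that accompanies the regression in the text is not reproduced here.

```python
def adjusted_r2(x, y):
    """Adjusted R^2 of an ordinary least-squares line y = a + b*x."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length sequences with at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # no variance: treat as no linear relationship
    r2 = sxy ** 2 / (sxx * syy)
    # One predictor: adj R^2 = 1 - (1 - R^2) * (n - 1) / (n - 2)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - 2)


def classify(x, y, threshold=0.1):
    """Label a cell by the Adj R^2 threshold quoted in the text (0.1)."""
    return "homogeneous" if adjusted_r2(x, y) > threshold else "heterogeneous"


# Hypothetical paired event amplitudes (arbitrary units) for two branches.
branch_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
covarying_branch_2 = [1.1, 1.9, 3.2, 3.9, 5.1]
independent_branch_2 = [3.0, 1.0, 3.0, 1.0, 3.0]

print(classify(branch_1, covarying_branch_2))
print(classify(branch_1, independent_branch_2))
```

In this toy example the first pair of branches covaries almost perfectly, whereas the second pair has no linear relationship, so its adjusted R² falls below the threshold.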


CFs convey distinct whisker receptive fields to 
separate primary dendrites 


To identify CF receptive fields (RFs) and their lo- 
calization on PC dendrites, we took advantage of 
the discrete organization of whiskers as a sen- 
sory input array (45, 46). We anaesthetized ani- 
mals to stimulate untrimmed individual whiskers 
at 2 Hz for 50 s periods while recording Ca²⁺
activity of PCs in medial Crus I (Fig. 4A and fig.
S8A; see materials and methods). Most PCs had
only global events identically represented across
primary dendrites (fig. S8, A2 and A3). However,
some PCs had high numbers of local events
in response windows during the stimulus period
(Fig. 4, B and C, and fig. S8A4) that varied
in magnitude between distinct whisker stimuli
(Fig. 4D), indicating RF selectivity.

Fig. 4. Branch-specific whisker receptive fields produced by CF multi-innervation of multibranched
PCs. (A) Schematic of the imaging configuration and whisker stimulation under anesthesia. (B) Sample
traces and deconvolved events by branch during 50 s whisker stimulation. Each whisker is tested twice;
data from both periods are combined. Responsiveness of one branch and not the other drives an enhanced
local event rate in B1 during the stimulus period. (C) Mean number of local branch events in response
windows during stimulus periods of each tested whisker (n = 13 animals P95-120, n = 33, 112, and 24 cells).
(D) Difference in local event number between whiskers eliciting maximum and minimum local responses
(n = 33, 112, and 24 cells). (E) Schematic of global versus lateralized responses. (F) Percentage of PCs by
dendritic response profile and morphological category. Fewer Normative PCs have lateralized responses than
multibranched PCs (n = 169 cells). (G) Cells with lateralized responses have shorter Split distances
(n = 75, 42, and 52 cells). (H) Cells with more spontaneous local events respond to a higher number of
whiskers (n = 151 cells). Summary points indicate mean ± SEM. *P < 0.05, **P < 0.01, ***P < 0.001.

Anaesthetized activity is sparsened, so re- 
sponses were determined using the z-scored 
response probability during the experimentally 
bootstrapped high-frequency stimulus (fig. S8, 
B to D). Comparing the z-scored response
probability of each dendrite, we observed a
“lateralized” response in some PCs, in which
local events of one branch constituted a whisker
response not observed in the other branch
(Fig. 4, E and F). Nearly all lateralized responses
arose in Split and Poly PCs (48/52 cells, 92%;
Fig. 4F) and in PCs with more spontaneous
signal heterogeneity, which map to Split and
Poly PCs as in previous experiments (fig. S8F).
Furthermore, PCs with lateralized responses
had more proximal dendrite bifurcations than
PCs with only global responses (18.73 µm versus
28.24 µm; Fig. 4G). Notably, PCs with higher rates
of branch-specific spontaneous events also ex- 
hibited responses to more whiskers, denoting 
an integration of more whiskers into their RFs 
(Fig. 4H). This supports the hypothesis that 
heterogeneous signals represent distinct, con- 
verging RFs such that heterogeneous PCs are 
sampling more upstream RFs carried by func- 
tionally independent CF inputs. 


CF-induced branch-specific representations 
of stimulus modality in awake mice 


Although anesthesia provided excellent con- 
trol and precision for single whisker stimula- 
tion, even subanesthetic ketamine alters network 
activity (47). To confirm that PC primary den- 
drites can differentially represent CF RFs in a 
more naturalistic state, we exposed awake ani- 
mals to uni- and multisensory stimuli (Fig. 5A). 
As a major hub for sensory integration during 
associative learning, PC dendrites are an impor- 
tant model for how converging input profiles are 
represented across dendrites. The amplitude
and duration of CF-induced dendritic Ca²⁺ spikes
depend on stimulus strength (43, 48), which
is reflected in CF burst behavior (44), and also
on synaptic connectivity and weight of the CF
input itself (49).

We stimulated awake animals with light 
(488 nm, ipsilateral), sound (12 kHz tone, bila- 
teral), and peri-oral air puff (10 psi, ipsilateral) 
stimuli either alone or in multimodal combi- 
nations while recording response properties 
in PC primary dendrites. Sensory-evoked events, 
more than spontaneous, typically produced a 
global dendritic signal with consistent intertrial 
amplitude ratio between branches (Fig. 5B1). 
However, we also observed complex sensory-
evoked bursts of CF input with heterogeneous
amplitudes between branches (Fig. 5, B2 and
B3) and either branch-specific responses alone
(Fig. 5B4) or combined with a global response
(Fig. 5B2). Whereas PCs with multiple primary
dendrites [Split and Poly (S/P)] had similar
total response probabilities as Normative PCs
(fig. S9C), a larger share of responses were
branch-specific in S/P PCs across stimulus
modalities (Fig. 5, C and D, and fig. S9, A and B).
To assess the relationship between uni- and
multimodal stimuli, we identified the maximum
branch-specific responses to stimuli of each
category (fig. S9D), obtained the difference
between uni- and multisensory maxima, and
found an enhanced rate of local responses in
S/P but not Normative PCs (fig. S9F). This
revealed that multisensory stimuli could enhance
the differential representation of CF RFs across
primary dendrites in putatively multi-innervated
PCs while failing to influence mono-innervated
Normative PCs.

Fig. 5. Branch-specific multisensory receptive fields. (A) Scheme of imaging and sensory stimulation
of awake animals. (B) Sample traces showing combinations of inter-branch responses to different stimulus
modalities. (C) The maximum number of local events observed for a stimulus of any category in Normative vs
Split and Poly (S/P) PCs (here and below: n = 12 animals, n = 24 and 38 cells). (D) The percentage of
responses having a local component, regardless of branch identity, across control (C), uni-, or multisensory
trials. Lines connect values for each PC. (E) Calculation of ΔBR (top) between the stimulus types most
favoring opposite branches. ΔBR values of each modality are calculated for each cell (bottom, schematic
points) to map the ΔBR profile across stimuli and identify the mean and range. (F) The range is more
pronounced in S/P cells (n = 24, 38). (G) No group difference in ΔBR mean (n = 24, 38). (H) Response profile
bilaterality is the subtraction of ΔBR mean from the range. S/P PCs exhibit more bilaterality due to high
ranges and low means (n = 24, 38). Summary points indicate mean ± SEM. *P < 0.05, **P < 0.01, ***P < 0.001.

While the previous analyses were blind to
branch identity, we next asked how much the
differential representation of each stimulus
could favor one branch over the other (fig. S9E).
We generated a Δ Branch Response (ΔBR) index
for each stimulus modality by calculating the
difference in branch-specific, local responses
as a fraction of total responses (Fig. 5E, top).
Absolute ΔBR indicates the reliability of local
responses on either branch whereas the sign of
the ΔBR indicates which branch over-represented
the modality. This allowed us to generate a
profile of branch-specific representation across all
stimulus modalities, which could be quantified
by the ΔBR mean and range for each cell (Fig. 5E,
bottom). In this way, PCs could be distinguished
as having one of three classes of multisensory
response profiles: global, with identical
representation across branches in all cases; unilateral,
with one branch exhibiting a larger RF
representation than the other; and bilateral, with
both branches capable of differentially
representing unique stimulus modalities.

On average, S/P cells had a wider range,
denoting branch-specific (e.g., unilateral or
bilateral) representations that were more distinct
across modalities (Fig. 5F and fig. S9, G to I).
Cells for which only one branch exhibits local
responses (that is, unilateral cells) would have both
a large ΔBR range and a ΔBR mean that
deviates from zero to favor that branch. To better
characterize whether some PCs had bilateral
representation profiles, we calculated the
bilaterality of the RF profile by subtracting the
ΔBR mean from the range. The local responses
of S/P cells, more than Normative, produced
RF profiles wherein a larger percentage of local
signaling produced bilateral representations
across sensory modalities (Fig. 5H and fig. S9,
G, H, and K). Collectively, this shows that PCs
with multiple primary dendrites can
differentially represent RFs of distinct CF inputs across
their separate dendrites in awake, mature mice
(for a summary of heterogeneous signaling
observed in our two-photon recordings across
the three dendrite morphologies in mouse PCs,
see fig. S10).
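
The ΔBR index and the derived bilaterality measure can be made concrete with a small Python sketch. It follows the verbal definitions above: ΔBR is the difference in local responses between the two branches as a fraction of total responses, and bilaterality is the ΔBR range minus the ΔBR mean. Two points are assumptions made for illustration: the magnitude of the mean is subtracted, so that a bias toward either branch lowers bilaterality, and the response counts are invented.

```python
def delta_br(local_b1, local_b2, total_responses):
    """Difference in branch-specific local responses as a fraction of all responses."""
    if total_responses <= 0:
        raise ValueError("total_responses must be positive")
    return (local_b1 - local_b2) / total_responses


def response_profile(responses):
    """Return (mean, range, bilaterality) of dBR across stimulus modalities.

    `responses` maps a modality name to a tuple
    (local responses on branch 1, local responses on branch 2, total responses).
    """
    values = [delta_br(*counts) for counts in responses.values()]
    mean = sum(values) / len(values)
    value_range = max(values) - min(values)
    # Assumption: subtract the magnitude of the mean from the range.
    bilaterality = value_range - abs(mean)
    return mean, value_range, bilaterality


# Hypothetical cells: each branch of the first prefers a different modality,
# whereas only branch 1 of the second shows local responses.
bilateral_cell = {"light": (4, 0, 10), "sound": (0, 4, 10), "puff": (0, 0, 10)}
unilateral_cell = {"light": (4, 0, 10), "sound": (2, 0, 10), "puff": (0, 0, 10)}

for name, cell in (("bilateral", bilateral_cell), ("unilateral", unilateral_cell)):
    mean, value_range, bilaterality = response_profile(cell)
    print(f"{name}: mean={mean:.2f} range={value_range:.2f} bilaterality={bilaterality:.2f}")
```

In this toy example the bilateral cell keeps its full range as bilaterality because its mean is zero, while the unilateral cell loses part of its range to the nonzero mean.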


Discussion 


We found that noncanonical CF multi-innervation 
of PCs does occur in the mature murine cere- 
bellum and is dependent on primary dendrite 
morphology. Nearly all observed multi-innervation 
occurred in neurons with multiple primary 
dendrites. Based on a quantitative categoriza- 
tion of >6000 PCs from three human brains, 
we report that this type of PC dendritic struc- 
ture is predominant in the human cerebellum. 
By contrast, we detected that only a minority 
of murine PCs fall into the Split or Poly cate- 
gory. Within these morphological groups, about 
25% of PCs were innervated by two or more CFs 
in the mouse. 

Our two-photon recordings suggest that most 
multi-innervated PCs have the capacity for 
branch-specific CF signaling and have distinct 
CF RFs. Our data do not allow us to conclude 
that the same results would be found in hu- 
man PCs if such recordings were possible. 
However, they do describe a new motif in PC 
dendritic compartmentalization: separate den- 
dritic subfields with their own assigned CF 
inputs may emerge when early branching forms 
a multibranched architecture. Variable CF burst 
frequency and modulation of CF input ampli- 
tude by MLIs may further contribute to com- 
partmentalization (37, 42). 

CFs provide instructive signals in cerebellar 
function and plasticity (17) by encoding signals 
related to error (19, 50), sensory omission (51),
as well as reward or reward-prediction (52, 53). 
Our findings constitute a substantial shift from 
the currently held belief that one CF innervates 
each PC. Instead, our observations suggest that 
one CF innervates each primary dendrite. 

The consequences for dendritic integration 
and, ultimately, the activation of target cells in 
the cerebellar nuclei (54) are potentially multi- 
fold. Here, we discuss those that immediately 
result from geometric considerations. Multi- 
branched structure often increases dendritic 
width in the sagittal plane (Fig. 3K), in some 
cases even opening a cleft between compart- 
ments (see Fig. 1, A and D, Figs. 3B and 46, fig. 
S1, B and C, and fig. S4B). This configuration 
inevitably leads to a wider physical gap between 
innervating PF bundles and thus to a potential 
functional separation of the contextual infor- 
mation they provide (55, 56). It is therefore 
conceivable that PCs drive spike output from 
a multitude of contextual input combinations 
that expand with increased dendrite size and 
complexity. Multiple CF innervation that, as 
we describe here, occurs at an elevated rate in 
multibranched PCs may serve several critical 
purposes. First, it may enhance PC function as 
a supervised associative learning perceptron 
that optimizes synaptic weights (57) by
providing RF-matched CF inputs, and thus relevant
errors and instructive signals, to the different
PF inputs that convey specific contextual
information (fig. S11A). In this way, the perceptron
orchestrates synaptic weight optimization based 
on compartmentalized, rather than all-dendritic, 
instructive signals. Thus, in receiving multiple 
CF inputs, some PCs are permitted to fully capi- 
talize on diverse context representations surveyed 
by their multibranched architecture. Second, 
multiple CF innervation may enable more com- 
plex PC computations, such as multiplexing 
and conveying input from a wider array of 
sensory modalities (fig. S11B). In this scenario, 
individual multibranched PCs may pair differ- 
ent contexts presented by their PF inputs with 
instructive information collected from a multi- 
modal environment. 

In both the human and mouse cerebellum, 
multibranched PCs are more prevalent in the 
posterior cerebellar hemisphere, a region linked 
to cognitive and affective roles (58-60). Whether 
or not the multibranched architecture enables 
such complex functions through the gained 
computational power that is postulated here 
needs to be investigated in future studies. 

On the other hand, excessive CF-PC strength 
resulting from disrupted synaptic pruning and 
ectopic innervation of distal PC dendrites is 
linked to pathological dysfunction in autism
model mice (27, 49) and in mouse and human
essential tremor (61). The preferential targeting of
distinct primary dendrites by multiple CFs (Fig. 
2B) may bring computational advantages while 
avoiding disruptive enhancement of CF inputs 
to individual dendritic compartments.

QUANTUM SIMULATION

Observation of universal Hall response in strongly interacting Fermions

T.-W. Zhou, G. Cappellini, D. Tusi, L. Franchi, J. Parravicini, C. Repellin, S. Greschner,
M. Inguscio, T. Giamarchi, M. Filippone, J. Catani, L. Fallani


The Hall effect, which originates from the motion of charged particles in magnetic fields, has
deep consequences for the description of materials, extending far beyond condensed matter.
Understanding such an effect in interacting systems represents a fundamental challenge, even
for small magnetic fields. In this work, we used an atomic quantum simulator in which we tracked the
motion of ultracold fermions in two-leg ribbons threaded by artificial magnetic fields. Through
controllable quench dynamics, we measured the Hall response for a range of synthetic tunneling and
atomic interaction strengths. We unveil a universal interaction-independent behavior above an
interaction threshold, in agreement with theoretical analyses. The ability to reach hard-to-compute
regimes demonstrates the power of quantum simulation to describe strongly correlated topological
states of matter.


Since its first observation in 1879 (1), the
Hall effect has been an extraordinary
tool for understanding solid-state systems
(2). This phenomenon is a macroscopic
manifestation of the motion of
charge carriers in materials subjected to a
magnetic field $B$, generating an electric field
$E_y$ perpendicular to the longitudinal current
$J_x$ flowing in the system. At a small magnetic
field, the Hall coefficient $R_H = E_y/(B J_x)$
permits the extraction of the effective charge $q$
and carrier density $n$, because $R_H \approx -1/nq$
in conventional conductors. The Hall effect
has widespread applications in metrology
and materials science, such as sensitive
measurements of magnetic fields and resistance
standards based on its quantized behavior
at large $B$ (3). The modern understanding
of the Hall effect establishes it as a
manifestation of robust geometric properties of
quantum systems: Fermi-surface curvature
of metals under weak magnetic fields (4, 5),
Berry curvature of anomalous Hall systems (6),
and topological invariants of band insulators
(7). Studies of Hall responses are ubiquitous in
fields that address topological quantum matter
(8) and synthetic realizations thereof (9, 10).

However, when interactions are present
among the carriers, understanding the Hall
coefficient becomes a theoretical challenge.
At large magnetic fields, interactions lead to
the fractional quantum Hall effect (11), where
the quantization of $R_H B$ to fractions of $h/e^2$
(where $h$ is Planck’s constant) reveals the
emergence of elementary excitations with fractional
charge and anyonic statistics (12, 13). For small
fields, the connection of $R_H$ with carrier
densities and topological invariants is lost,
leading to numerous theoretical attempts (14-21)
to understand the effects of many-body
correlations on this quantity. This complicates the
interpretation of measured anomalous
temperature dependence and sign changes of $R_H$ in
the normal phase of cuprates (22, 23),
disordered superconducting films (24), and
organic compounds (25, 26). Numerical progress
has recently allowed a reliable calculation
of the Hall coefficient (27) in a
quasi-one-dimensional (1D) geometry and predicted a
threshold of interactions above which the Hall
coefficient becomes interaction independent
and thus universal.

In this context, ultracold atoms in optical 
lattices provide an opportunity to gain insight 
into the fundamental aspects of interacting 
Hall systems, owing to their flexibility and con- 
trollability. A notable recent advance was the 
realization of artificial magnetic fields in op- 
tical lattices through various schemes, including 
laser-induced tunneling, Floquet engineering, 
and synthetic dimensions (9, 28-31). These 
schemes have been exploited to explore single- 
particle (32-35) and few-body (36, 37) phenomena, 
whereas the observation of strongly correlated 
many-body effects triggered by interactions 
has remained elusive. 

In this work, we report on the measurement 
of the Hall response in a quantum simulator 
with strongly interacting ultracold fermions. 
By controlling the repulsion between particles, 
we obtained experimental evidence of the uni- 
versal response that is predicted at a large 
interaction strength. We used a synthetic di- 
mension to engineer a two-leg ladder whose 
plaquettes are threaded by a synthetic magnetic
flux $\varphi$ (Fig. 1). We monitored the real-time
dynamics of the system after the instantaneous
quench of a linear potential, which tilts the
lattice along $\hat{x}$ and mimics the action of a
longitudinal electric field $E_x$. We observe that the
combined action of $E_x$ and $\varphi$ triggers a
longitudinal current $J_x$, accompanied by the Hall
polarization of the system along the transverse
direction $P_y$. Even though the dynamics of $J_x$
and $P_y$ strongly depend on microscopic ladder
parameters, we observe that a proxy of the
Hall coefficient, the Hall imbalance (27)

$$\Delta_H = \frac{P_y}{J_x} \tag{1}$$

converges toward an interaction-independent
value for large atomic repulsions. Our
observations quantitatively agree with theoretical
calculations and confirm the predictions
reported in (27). The results showcase the
importance of interactions in Hall systems, paving
the way toward the investigation of strongly
correlated effects in topological phases of
synthetic quantum matter.

Fig. 1. Experimental scheme. A synthetic ladder is
realized by trapping fermionic $^{173}$Yb atoms in a
1D optical lattice and coupling their nuclear spins
$m_F = -1/2$ and $m_F = -5/2$ via two-photon Raman
transitions. The position-dependent phase of the
Raman coupling simulates a magnetic field $B$
with Aharonov-Bohm phase $\varphi$ per unit cell. An
atomic current is activated by tilting the ladder with
an optical gradient, equivalent to a constant electric
field $E_x$. The radius difference between the green
and blue spheres illustrates the leg population
imbalance (Hall polarization) induced by the Hall
drift. The time-dependent longitudinal current $J_x(t)$
and the Hall polarization $P_y(t)$ are measured with
time-of-flight imaging and optical Stern-Gerlach
detection, respectively (typical acquisitions are
shown below the ladder).


Making and probing a synthetic ladder 


Our experiment exploits an ultracold Fermi
gas of $^{173}$Yb atoms initially polarized in the
$|F = 5/2, m_F = -5/2\rangle$ hyperfine state. The
atoms are trapped in a 1D optical lattice, which
allows real tunneling between different sites
along direction $\hat{x}$. An additional 2D lattice (not
shown in Fig. 1) freezes the atomic motion
along the orthogonal real-space directions,
forming a 2D array of fermionic tubes. By
adiabatically activating the coherent Raman
couplings between nuclear spin states
$|m_F = -1/2\rangle$ and $|m_F = -5/2\rangle$ (denoted $m = 1, 2$,
respectively), our system realizes a two-leg
ladder (32), in which the nuclear spins act
as different sites along a synthetic dimension
$\hat{y}$ (see Fig. 1). The system is described by a
two-leg version of the interacting
Harper-Hofstadter Hamiltonian

$$H = -t_x \sum_{j,m} \left[ \hat{c}^{\dagger}_{j,m} \hat{c}_{j+1,m} + \mathrm{h.c.} \right] - t_y \sum_{j} \left[ e^{i\varphi j}\, \hat{c}^{\dagger}_{j,1} \hat{c}_{j,2} + \mathrm{h.c.} \right] + U \sum_{j} \hat{n}_{j,1} \hat{n}_{j,2} \tag{2}$$

where $\hat{c}_{j,m}$ ($\hat{c}^{\dagger}_{j,m}$) is the fermionic annihilation
(creation) operator on site $(j, m)$ in the real ($j$)
and synthetic ($m = 1, 2$) dimensions,
$\hat{n}_{j,m} = \hat{c}^{\dagger}_{j,m} \hat{c}_{j,m}$, and h.c. is the Hermitian conjugate
operator. Here, $t_x$ is the nearest-neighbor
tunneling amplitude, and $U > 0$ is the “on-rung”
interaction energy between two atoms
with different nuclear spins in the same
real-lattice site. The coupling between two spin
states $t_y e^{i\varphi j}$ is interpreted as a tunneling along
the synthetic dimension, whereby the
position-dependent phase simulates the effect of a
static magnetic flux $\varphi$ threading the ladder;
in our experiments, $|\varphi| = 0.32\pi$. A residual
harmonic confining potential results in an
additional term $H_{\mathrm{conf}} = V_x \sum_{j,m} j^2\, \hat{n}_{j,m}$, with
the confinement strength $V_x = 0.01 t_x$. The
atomic repulsion $U$ is controlled, independently
from $t_x$ and $t_y$, by changing the radial
confinement of fermionic tubes via the 2D lattice depth;
to keep $V_x$ constant while changing $U$, we added
a weak double-well potential along direction $\hat{x}$,
compensating the trapping frequency by
adjusting the potential barrier between the two
wells (38).
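
To make the structure of Eq. 2 concrete, the following Python sketch builds the single-particle (non-interacting, $U = 0$) part of the two-leg ladder Hamiltonian as a dense matrix and checks that it is Hermitian. It is only an illustration under stated assumptions: the basis ordering, the function names, the lattice length, the default tunneling values, and the choice to put the phase on the rung term are assumptions, and the confinement and interaction terms are omitted.

```python
import cmath
import math


def site_index(j, m):
    """Map real-lattice site j (0-based) and leg m (1 or 2) to a matrix index."""
    return 2 * j + (m - 1)


def ladder_hamiltonian(n_sites, t_x=1.0, t_y=1.0, phi=0.32 * math.pi):
    """Single-particle two-leg ladder Hamiltonian with flux phi per plaquette."""
    dim = 2 * n_sites
    h = [[0j] * dim for _ in range(dim)]
    for j in range(n_sites):
        # Synthetic-dimension (rung) tunneling with position-dependent phase.
        a, b = site_index(j, 1), site_index(j, 2)
        h[a][b] += -t_y * cmath.exp(1j * phi * j)
        h[b][a] += -t_y * cmath.exp(-1j * phi * j)
        # Real-space tunneling along each leg.
        if j + 1 < n_sites:
            for m in (1, 2):
                a, b = site_index(j, m), site_index(j + 1, m)
                h[a][b] += -t_x
                h[b][a] += -t_x
    return h


def plaquette_phase(h, j):
    """Phase picked up by hopping once around the plaquette between sites j and j+1."""
    a1, a2 = site_index(j, 1), site_index(j, 2)
    b1, b2 = site_index(j + 1, 1), site_index(j + 1, 2)
    loop = h[a1][a2] * h[a2][b2] * h[b2][b1] * h[b1][a1]
    return cmath.phase(loop)


h = ladder_hamiltonian(4)
is_hermitian = all(
    abs(h[a][b] - h[b][a].conjugate()) < 1e-12
    for a in range(len(h))
    for b in range(len(h))
)
print("Hermitian:", is_hermitian)
print("|plaquette phase| / pi:", abs(plaquette_phase(h, 0)) / math.pi)
```

The magnitude of the phase accumulated around any plaquette equals $\varphi$, while its sign depends on the orientation chosen for the loop.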

To generate a current along #, we switched on 


an optical gradient A quench = -Ex>, ntti: 


tilting the ladder along the real-lattice direction, 
with E, = 0.5t,. After time t, we measured the 
particle current J, in the real dimension and the 
spin polarization P,, in the synthetic dimension. 
To perform these measurements, the Raman cou- 
pling was abruptly switched off to freeze the 
population along the synthetic dimension. The 
lattice momentum distribution in the real di- 
mension n(, t), normalized to the total atom 
number, was then measured with a band- 
mapping technique, where the lattice momenta 
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kare expressed in units of the real-lattice wave 
number /,, = n/d and d = 380 nm is the lattice 
spacing. We thus access the current J,, given by 


$$J_x(\tau) = \int_{-1}^{1} \sin(\pi k)\, n(k, \tau)\, dk \qquad (3)$$


In the synthetic dimension, the spin distribution is measured by performing an optical Stern-Gerlach detection (39). This method, based on the spin-dependent force exerted by a near-resonant laser beam, allows a spatial separation of the two spin components and the separate count of the atom number $N_m$ in both of them. The spin polarization coincides with the transverse (Hall) polarization $P_y$ of the system, which we define as


$$P_y(\tau) = \frac{N_1(\tau) - N_2(\tau)}{N_1(\tau) + N_2(\tau)} - \frac{N_1(0) - N_2(0)}{N_1(0) + N_2(0)} \qquad (4)$$


This definition evaluates the difference in fractional spin population with respect to the initial value, with the populations $N_1(0)$ and $N_2(0)$ measured right before the application of the optical gradient. The definition in Eq. 4 accounts for the small initial population difference caused by residual off-resonant coupling to the nuclear spin state $|m_F = +3/2\rangle$ (38); this initial difference can safely be neglected owing to the averaging procedure discussed in the next section. We determined the Hall imbalance $\Delta_H$ from the ratio between the measured $P_y$ and $J_x$, following Eq. 1.
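The chain from raw observables to the Hall imbalance can be sketched numerically. The snippet is a minimal illustration under stated assumptions, not the analysis code used in the experiment: the function names, the trapezoidal integration over $k \in [-1, 1]$, and the toy momentum distribution and atom counts are all hypothetical. It evaluates Eq. 3, Eq. 4, and the ratio $P_y/J_x$.

```python
import math


def particle_current(ks, nk):
    """Trapezoidal estimate of J_x = integral of sin(pi*k) * n(k) dk (Eq. 3)."""
    total = 0.0
    for i in range(len(ks) - 1):
        f_left = math.sin(math.pi * ks[i]) * nk[i]
        f_right = math.sin(math.pi * ks[i + 1]) * nk[i + 1]
        total += 0.5 * (f_left + f_right) * (ks[i + 1] - ks[i])
    return total


def hall_polarization(n1_t, n2_t, n1_0, n2_0):
    """Eq. 4: fractional spin imbalance at time tau minus its initial value."""
    now = (n1_t - n2_t) / (n1_t + n2_t)
    initial = (n1_0 - n2_0) / (n1_0 + n2_0)
    return now - initial


def hall_imbalance(p_y, j_x):
    """Ratio of Hall polarization to particle current."""
    return p_y / j_x


# Toy inputs (hypothetical, not measured data).
# k is in units of k_L, and n(k) is normalized to 1 on [-1, 1].
ks = [-1.0 + 2.0 * i / 400 for i in range(401)]
nk = [0.5 * (1.0 + math.sin(math.pi * k)) for k in ks]

j_x = particle_current(ks, nk)
p_y = hall_polarization(60, 40, 52, 48)
delta_h = hall_imbalance(p_y, j_x)
print(j_x, p_y, delta_h)
```

A distribution that is symmetric in $k$ gives zero current in this sketch, because $\sin(\pi k)$ is odd; only the asymmetry induced by the tilt contributes.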


Measuring the Hall effect 


Figure 2 shows the measured current, polarization, and Hall imbalance as a function of time $\tau$ (defined in units of $\hbar/t_x$, where $\hbar$ is the reduced Planck’s constant) for a particular choice of experimental parameters $t_y = 3.39 t_x$


Fig. 2. Time evolution of the particle current $J_x$, Hall polarization $P_y$, and Hall imbalance $\Delta_H$. The experimental data are measured at dimensionless time $\tau$ (in units of $\hbar/t_x$) for $t_y = 3.39 t_x$ and $U = 6.56 t_x$, after applying an instantaneous tilt $E_x = 0.5 t_x$. The values of $J_x$ and $P_y$ (top plot; green and red, respectively) are evaluated by averaging two individual sets of measurements for $\varphi = +0.32\pi$ and $\varphi = -0.32\pi$, each comprising 10 to 15 images at every time step; the error bars represent the standard error of the mean and are obtained with a statistical bootstrap method. The values of $\Delta_H$ (bottom plot; blue) are computed from the data in the top plot according to Eq. 1, and the error bars represent the standard error of the mean and are obtained with standard uncertainty propagation. The shaded areas are theoretical predictions accounting for the distribution of atom numbers in the tubes and experimental temperature uncertainty $1 < T/t_x < 2$. They result from a MFA (see main text), where the


and $U = 6.56 t_x$. We performed identical measurements with both $\varphi = +0.32\pi$ and a reversed direction of the synthetic magnetic field $\varphi = -0.32\pi$ and observed a change of the sign in $P_y(\tau)$ (38). This behavior confirms the interpretation of our data in terms of the emergence of a Hall response. We averaged these two independent measurements of $\{J_x(\tau), P_y(\tau)\}$ for $\varphi = +0.32\pi$ and $\{J_x(\tau), -P_y(\tau)\}$ for $\varphi = -0.32\pi$ to improve the signal-to-noise ratio and minimize the effect of the residual off-resonant coupling to the third state (38). We observed that the Hall imbalance $\Delta_H$ (Fig. 2, bottom) rapidly approaches a stationary regime, with small amplitude deviations around a limiting value, whereas the dynamical evolution of $J_x$ and $P_y$ remains transient. This fast convergence of $\Delta_H$ is reproduced by the theoretical model described in the next paragraph and conveniently allows us to measure the stationary Hall response using quench dynamics.
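The flux-reversal average can be illustrated with a short sketch. It assumes that the two runs are sampled at the same time steps and that any spurious polarization offset does not change sign with $\varphi$; the function name and the toy numbers are hypothetical and do not come from the experiment.

```python
def average_flux_reversed(j_plus, p_plus, j_minus, p_minus):
    """Average {J_x, P_y} taken at +phi with {J_x, -P_y} taken at -phi."""
    j_avg = [(a + b) / 2.0 for a, b in zip(j_plus, j_minus)]
    # The polarization flips sign with the flux, so the -phi run enters with a minus sign.
    p_avg = [(a - b) / 2.0 for a, b in zip(p_plus, p_minus)]
    return j_avg, p_avg


# Toy example: a Hall signal that flips sign with phi plus a sign-independent offset.
signal = [0.00, 0.10, 0.15]
offset = 0.02
p_plus = [s + offset for s in signal]
p_minus = [-s + offset for s in signal]
j_plus = [0.00, 0.30, 0.42]
j_minus = [0.00, 0.32, 0.40]

j_avg, p_avg = average_flux_reversed(j_plus, p_plus, j_minus, p_minus)
print(j_avg, p_avg)
```

Under the stated assumption, the offset cancels exactly in the averaged polarization, whereas the current is simply averaged over the two runs.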
According to the theoretical predictions reported in (27), the stationary Hall imbalance for strong interactions ($U > t_x$) is expected to reach the $U$-independent universal value

$$\Delta_H = 2\,\frac{t_x}{t_y} \left| \tan\!\left( \frac{\varphi}{2} \right) \right| \qquad (5)$$


Our simulator yields results consistent with this universal value (Fig. 2, bottom) despite important differences with the setup used in (27) (parabolic confinement, nonlinear drive, tubes with different particle occupations, and finite temperatures $T \sim t_x$) and without using any fitting procedure. To explain this robustness, we provide a theoretical analysis. First, we performed extensive density matrix renormalization group (DMRG) (40) simulations at zero


[Fig. 2 plot. Horizontal axis: $\tau$ (0 to 5); legend: Universal Value, Exp Data.]


renormalized tunneling $t_y^* = 5 t_x$ is evaluated through comparison with zero-temperature DMRG. The parameter $t_y^*$ is introduced to allow a meaningful comparison between the MFA and experiment. The gray dashed line indicates the universal relation Eq. 5.

Fig. 3. Time-averaged Hall imbalance as a function of synthetic tunneling. (A to C) Single-particle energy spectrum $\varepsilon(k)$ calculated as a function of the quasimomentum $k$ for different values of $t_y/t_x$ [labeled as A, B, and C in (D)]. The interband gap increases with $t_y/t_x$, eventually leading to two separate bands (C). The color scale represents the population of the $m = 1$ state. (D) The experimental data (blue circles) are measured at $U = 6.56 t_x$ and $|\varphi| = 0.32\pi$, with the averaging procedure and error analysis detailed in Fig. 2. The horizontal and vertical error bars show the experimental uncertainty in $t_y$ and the uncertainty resulting from the time average, respectively. The red solid line is the DMRG simulation at zero temperature for a fixed atomic interaction $U = 6.56 t_x$ and a number of rungs $L = 200$, accounting for different tube occupations. The yellow solid line is the MFA at zero temperature with renormalized $t_y^* = t_y + 0.1U$ (see main text for details). The shaded area illustrates the MFA at finite temperatures $0.5 < T/t_x < 2$, and the gray dashed line indicates the universal relation from Eq. 5.

[Fig. 3 plot. Panel labels include $t_y/t_x = 0.45$ and $t_y/t_x = 1.16$; vertical axis of (D): $(t_y/t_x)\langle\Delta_H\rangle$.]


Fig. 4. Time-averaged Hall imbalance as a function of atomic interaction. The experimental data (blue circles) are measured at $t_y = 1.15 t_x$ and $|\varphi| = 0.32\pi$, with the averaging procedure and error analysis detailed in Fig. 2. The horizontal and vertical error bars show the experimental uncertainty in $U$ and the uncertainty resulting from the time average, respectively. The red solid line is the DMRG simulation at zero temperature for a fixed synthetic tunneling $t_y = 1.15 t_x$ and $L = 200$ rungs, accounting for different tube occupations. The yellow solid line is the MFA at zero temperature, with the substitution $t_y \to t_y^* = t_y + 0.30\,U$. The shaded area illustrates MFA results for finite temperatures $0.5 < T/t_x < 2$. The gray dashed line indicates the universal relation from Eq. 5, and the dot-dashed line depicts the result for noninteracting fermions at zero temperature.

[Fig. 4 plot. Vertical axis: $(t_y/t_x)\langle\Delta_H\rangle$; legend: Universal Value, Exp Data.]


temperature (finite-temperature DMRG would be prohibitively costly). To give a semiquantitative account of the effects of finite temperatures in the nonuniversal regime (intermediate $U$ and $t_y$), we resorted to a mean-field approximation (MFA) of interactions, which results in an effective increase of the transverse coupling $t_y \to t_y^*$. For each value of $U$, we first find the $t_y^*$ that best reproduces the zero-temperature DMRG real-time simulations of the current $J_x$ and polarization $P_y$. We find a quantitative agreement between MFA and DMRG if the MFA polarization is multiplied by $t_y^*/t_y$; no rescaling is required for $J_x$. We discuss the clear limitations of MFA to describe the dynamics of such strongly correlated low-dimensional systems (38). Nonetheless, this approach gives a reasonable explanation for why, at finite temperatures, larger values of the transverse hopping $t_y$ or the interaction $U$ are required to observe the universal Hall response (Eq. 5), as observed and explained below.


Testing universality 


To pinpoint the onset of the universal regime, we measured the dependence of the Hall imbalance’s stationary value on the system parameters. We considered the averaged Hall ratio $\langle\Delta_H\rangle = \langle P_y(\tau)/J_x(\tau)\rangle$ in the time interval $\tau \in [1, 5]$. Figure 3D shows the measured $\langle\Delta_H\rangle$ for a fixed interaction strength $U = 6.56 t_x$ and different values of the tunneling ratio $t_y/t_x$, which is controlled by changing the Raman beam power. The averaged Hall imbalance $\langle\Delta_H\rangle$ is small at small synthetic tunneling ($t_y < t_x$) and reaches the universal value (Eq. 5) for $t_y/t_x \gtrsim 2$. Provided that the system is below half-filling, this transition also exists in noninteracting systems, where a large transverse hopping $t_y$ opens a large gap between the two bands of the system, stabilizing a single-band metal whose Hall imbalance has the universal value (Eq. 5) (27, 38) (see single-particle energy bands in Fig. 3, A to C). Because finite temperatures tend to promote particles to the upper band, we expect the finite temperature ($T \sim t_x$) in our setup to push the transition to the universal regime toward larger values of $t_y/t_x$. This effect is visible in Fig. 3: Whereas zero-temperature DMRG predicts a transition to the universal regime at smaller values of $t_y/t_x$, the finite-temperature MFA yields a better quantitative agreement with the experiment.
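The time-averaged Hall ratio defined above can be sketched as follows. This is a minimal illustration that assumes an unweighted mean of $P_y(\tau)/J_x(\tau)$ over the sampled time steps with $1 \le \tau \le 5$; the text does not specify the weighting, and the function name and toy time series are hypothetical.

```python
def time_averaged_hall_imbalance(taus, j_x, p_y, t_min=1.0, t_max=5.0):
    """Unweighted mean of P_y(tau) / J_x(tau) over t_min <= tau <= t_max."""
    ratios = [p / j for t, j, p in zip(taus, j_x, p_y) if t_min <= t <= t_max]
    if not ratios:
        raise ValueError("no time steps inside the averaging window")
    return sum(ratios) / len(ratios)


# Toy time series (hypothetical, not measured data); tau is in units of hbar / t_x.
taus = [0.5, 1.0, 2.0, 3.0, 4.0, 5.0]
j_x = [0.10, 0.20, 0.25, 0.20, 0.25, 0.20]
p_y = [0.01, 0.06, 0.08, 0.06, 0.07, 0.06]

mean_delta_h = time_averaged_hall_imbalance(taus, j_x, p_y)
print(mean_delta_h)
```

The first time step lies outside the window and is excluded, so the early transient does not enter the average.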

Finally, we demonstrate the interaction-driven origin of the universal Hall response of Eq. 5. Figure 4 illustrates the behavior of the Hall imbalance $\langle\Delta_H\rangle$ upon changing the interaction strength $U/t_x$ at a fixed, nearly isotropic tunneling $t_y = 1.15 t_x$. We observe that $\langle\Delta_H\rangle$ quickly deviates from the noninteracting value and approaches the $U$-independent universal value $2(t_x/t_y)|\tan(\varphi/2)| \approx 1.1$ at large $U/t_x$. In the spirit of the MFA, this behavior can be partially explained by the two-band scenario discussed before: Interactions renormalize $t_y$ toward an interaction-dependent value $t_y^* > t_y$, enlarging the gap between the bands. In the large-$U$ limit, the bands are well separated and the highest band becomes empty, similarly to what happens in the large-$t_y$ limit (Fig. 3, A to C). Increasing $U$ thus leads to a robust single-band metallic state characterized by a constant, universal value of $\Delta_H$ (27). As discussed above, the MFA accounts for finite-temperature effects and permits a quantitative comparison between experiment and theory. Similar to the transition from weak to large transverse tunneling described in the previous paragraph, the finite temperature pushes the transition toward larger interaction strengths than those predicted by the zero-temperature DMRG, as observed in the experimental data.

Despite the effectiveness of the MFA picture, we stress the essential and nonperturbative role played by strong interactions to reach the universal regime in Fig. 4, which fundamentally differentiates it from the large-$t_y$ limit (Fig. 3). Indeed, although strong interactions can be modeled by the MFA using a renormalized $t_y^*$, the universal regime is reached anyway for $\Delta_H = 2(t_x/t_y)|\tan(\varphi/2)|$, and not for $\Delta_H = 2(t_x/t_y^*)|\tan(\varphi/2)|$. We emphasize that the observed effect is truly a many-body effect, as captured by the DMRG calculations. The MFA allows us to make progress at finite temperatures because it reproduces the stabilization of the single-band metal at large $U$ by increasing $t_y$, where the Hall imbalance approaches a universal value when $t_y > T$ (38). Nonetheless, MFA requires its results to be rescaled to give a quantitative account of the universal regime and has clear limitations with respect to reproducing the fine dynamics of the polarization at large $U$ (38). The MFA should thus be complemented with more complete, but much more difficult, exact finite-temperature studies.
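For orientation, the universal relation of Eq. 5 can be evaluated directly for the parameters quoted in the text. The sketch below is illustrative only; the function name is hypothetical, and the rescaled quantity $(t_y/t_x)\Delta_H$ is printed on the assumption that it is what the vertical axes of Figs. 3 and 4 display.

```python
import math


def universal_hall_imbalance(ty_over_tx, phi):
    """Eq. 5: Delta_H = 2 * (t_x / t_y) * |tan(phi / 2)|."""
    return 2.0 / ty_over_tx * abs(math.tan(phi / 2.0))


phi = 0.32 * math.pi  # flux magnitude used in the experiment
for ratio in (1.15, 3.39):  # tunneling ratios t_y / t_x quoted in the text
    delta_h = universal_hall_imbalance(ratio, phi)
    rescaled = ratio * delta_h  # equals 2 * |tan(phi / 2)| for any ratio
    print(f"t_y/t_x = {ratio}: Delta_H = {delta_h:.3f}, (t_y/t_x) Delta_H = {rescaled:.3f}")
```

The rescaled value depends only on the flux and is close to 1.1 for $|\varphi| = 0.32\pi$; it is also unchanged under $\varphi \to -\varphi$ because of the absolute value.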


Conclusions 


In this experiment, we have shown distinctive 
many-body effects triggered by strong interac- 
tions in the Hall response of a controllable 
quantum simulator of a two-leg ladder threaded 
by a synthetic magnetic flux. Beyond the clear 
potential of such experiments to measure Hall 
voltages (47) and clarify the exotic Hall response 
of strongly correlated solid-state conductors, 
our work paves the way to the investigation 
of the exotic transport properties of strongly 
correlated topological phases of matter. This 
cold-atom experiment enters unknown ter- 
ritory for theory because it features strong 
correlations and finite temperatures and yet 
shows full control of the simulation param- 
eters. An interesting perspective resides in 
investigating interacting ladders with a larger 
number of nuclear spin states, a regime notoriously difficult to access with present computational techniques.


From language development to language evolution:
A unified view of human lexical creativity


Thomas Brochhagen, Gemma Boleda, Eleonora Gualdoni, Yang Xu


A defining property of human language is the creative use of words to express multiple meanings through word 
meaning extension. Such lexical creativity is manifested at different timescales, ranging from language 
development in children to the evolution of word meanings over history. We explored whether different 
manifestations of lexical creativity build on a common foundation. Using computational models, we show that a 
parsimonious set of semantic knowledge types characterize developmental data as well as evolutionary 
products of meaning extension spanning over 1400 languages. Models for evolutionary data account very well 
for developmental data, and vice versa. These findings suggest a unified foundation for human lexical creativity 
underlying both the fleeting products of individual ontogeny and the evolutionary products of phylogeny across languages.


Humans often need to talk about new entities and concepts but must rely on a limited vocabulary of known words to do so. A common solution to overcoming this bottleneck is the creative use of single words to express multiple meanings through a process known as word meaning extension (1-3).

In linguistic development, Vygotsky, among 
others, documented word meaning extension 
in young children, noting that a word such as 
“quah” can be overextended to express “a duck,” 
“water,” “liquid,” or “a coin with an eagle on it” 
(4-6). At the individual level, child overexten- 
sion is transient: It occurs during the early 
stages of life and vanishes in later language 
development (7) (the definition of overexten- 
sion and other key terms can be found in 
Table 1). By contrast, at the population level 
in language evolution, more stable forms 
of lexical creativity become entrenched in 
language after longer periods of time because 
of cultural transmission. Colexification—the 
phenomenon by which related meanings (such 
as “finger” and “toe”) are expressed with the 
same word (e.g., “dit” in Catalan)—can be a 
product of this process and is attested across 
languages (8-11). Similarly, words can also acquire
new meanings over time through semantic
change (1, 12-14). For example, the
word “mouse” was extended to refer to a com- 
puter device because of the visual similarity 
between the device and a rodent. 

Despite these differences in the manifestations of lexical creativity across levels (individual versus population) and timescales (short versus long), we posit that children and language users in general tackle the same fundamental task: to extend known words to referents that lack a word by relating those referents to the known words’ current meanings. If this is true, we expect the kinds of overextensions that appear in child development to be similar to the kinds of meaning extensions attested in language evolution because both build on a common foundation grounded in human experience and cognition. We developed a computational framework to test this idea and have shown that it receives support in a large-scale analysis.

Developmental and evolutionary phenomena pertaining to lexical creativity have been studied by different research communities. Research in developmental psychology suggests that child overextension relies on the ability to identify similarity relations among concepts. This ability has been shown to draw on multiple types of semantic knowledge by using perceptual, action-functional, affective, and contextual information (6, 15). Similarly, recent work has demonstrated that visual, taxonomic, and associative information can jointly explain a variety of child overextension patterns (15). Work in linguistics and cognitive psychology has similarly suggested that semantic relatedness plays a role in colexification (10, 11, 16-18) and historical semantic change (1, 3, 13, 19, 20). In this work, we integrated these separate lines of research by asking whether different forms of lexical creativity, from that of childhood to language evolution, rely on a common foundation.

Our study is relevant to the long-standing issue concerning the relation between linguistic development and the evolution of language. One locus of this issue is the possibility that ontogeny recapitulates phylogeny in language (21, 22); that is, whether there are recurring patterns in child development that inform or reflect patterns in language evolution or vice versa. Previous work has suggested that ontogeny is shaped by a particular language, emphasizing learning at the individual level, whereas phylogeny is a product of lasting innovations, emphasizing language use at the population level (23). By contrast, recent studies have shown that cross-linguistic patterns can recur during individual language learning (24-26), that cultural evolution of linguistic structures can be recapitulated in the laboratory (27), and that some development data predict lexical evolution rates (28).

We investigated the relation between linguistic development at the individual level and language evolution at the population level by taking a different perspective informed by word meaning extension. There is only a small subset of meaning extensions that directly overlap between child development and products of language evolution (supplementary materials) (Fig. 1A). This might be taken as weak evidence for the idea that ontogeny at least partially recapitulates phylogeny or vice versa. Our approach aims at understanding the nonoverlapping, broader space of creative meaning extensions that is illustrated in Fig. 1A: not only the intersection, but the union of these phenomena.


Table 1. Key terms and their definitions as operationalized in this study.

Term: Definition
Affectiveness: Measure of how pleasant, intense, and dominant a term is perceived to be (e.g., “sunshine” scores high on all three dimensions).
Associativity: Measure of how relatable two terms are (e.g., “key” and “door” are more closely associated than either is to “dog”).
Colexification: Meanings expressed by the same word are colexified (e.g., Catalan uses a single word, “dit,” to express both “finger” and “toe”).
Ontogeny: An individual's developmental history (e.g., their language use from early acquisition to adulthood).
Overextension: Extended use of a known term to referents outside its normal category (e.g., a child saying “apple” for any round object or “dog” for any animal).
Semantics: Meaning, or the study thereof.
Semantic change: Historical change in word meaning (e.g., “mouse” taking on the new meaning “computer device”).
Visual similarity: Measure of resemblance based on visual features (e.g., how similar apples and balls look across many images).

We present a unified view of lexical creativity 
by hypothesizing that there is a latent common 
foundation that children and language users in 
general both build on when using words cre- 
atively. This common foundation relies on two 
components: first, grounded knowledge about 
objects, events, properties, and relations, such 
as objects having certain shapes or belonging 
to certain categories; and second, the use of 
this knowledge to link referents lacking a 
word with current meanings of known words 
based on similarities between the two. We 
thus tested the proposal that both the types of 
knowledge and the use of similarity in word 
meaning extension are shared in child overex- 
tension and products of lexical creativity from 
language users in general. 

Figure 1B summarizes the computational 
framework we developed to test our proposal, 
which involves (i) explicitly defining a set of 
semantic knowledge types as proxies for the 
hypothesized common foundation and (ii) 
using them to make cross-predictions about 
products of lexical creativity in development 
and evolution. If the different forms of lexical 
creativity draw on different knowledge types 
or do so to very different degrees, we would 
expect minimal carryover between the pheno- 
mena. If instead our unified view is warranted, 
we expect good cross-predictability; that is, we 
expect that models built from child data will 
successfully account for data that are the prod- 
uct of language evolution and vice versa. 


Framework 


We developed a framework that incorporates 
four semantic knowledge types discussed 
commonly in the literature in connection to 
child overextension, colexification, and seman- 
tic change (6, 10, 11, 15, 18, 29, 30): associativity, 
and similarities based on visual, taxonomic, 
and affective information (Fig. 1B). These are 
largely complementary in the information they 
provide (tables S1 to S3), but they are not ex- 
haustive. We operationalized these knowledge 
types on the basis of English resources, a limitation
to which we return later. Materials
and methods are available as supplementary 
materials. 

Fig. 1. Illustrations of the phenomenon of lexical creativity and the overall framework. (A) Examples of word meaning extension as a common form of lexical creativity. Each is an attested pair of meanings that are coexpressed by a word form. The circles’ intersection shows examples of child overextensions (in development) that are also attested in lexicons of the world's languages (through evolution). Cases outside of this area are only attested at one timescale. (B) Framework for investigating the possibility of a common foundation in lexical creativity. Four semantic knowledge types are considered: visual similarity, associativity, taxonomy, and affectiveness. The framework enables cross-prediction between developmental and evolutionary phenomena.

We operationalized visual similarity using computationally derived visual representations of meanings. We followed a two-step procedure, drawing on existing work (15, 31). First, we used a computer vision model (32) to produce representations for images of instances of meanings (e.g., for images of dogs for the meaning “dog”). Second, we averaged these instance representations, yielding average visual representations that we took as surrogate meaning representations. We used these average representations to calculate the visual similarity of different meanings.
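The two-step procedure can be sketched in a few lines. This is a minimal illustration under stated assumptions: the three-dimensional vectors are hypothetical stand-ins for the output of a computer vision model, and cosine similarity is assumed as the comparison measure, which this passage does not specify.

```python
import math


def mean_vector(vectors):
    """Step 2: average the instance representations of one meaning."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Step 1 stand-in: hypothetical image representations for three meanings.
dog_images = [[0.9, 0.1, 0.3], [0.7, 0.3, 0.1]]
cat_images = [[0.8, 0.2, 0.4], [0.6, 0.2, 0.2]]
key_images = [[0.1, 0.9, 0.0], [0.0, 0.7, 0.2]]

dog, cat, key = (mean_vector(v) for v in (dog_images, cat_images, key_images))
dog_cat = cosine_similarity(dog, cat)
dog_key = cosine_similarity(dog, key)
print(dog_cat, dog_key)
```

With these toy vectors, the averaged “dog” representation is closer to “cat” than to “key,” which is the kind of contrast the visual predictor is meant to capture.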

We defined associativity in terms of how 
closely meanings are relatable in semantic 
memory. We quantified this using large-scale 
experimental data (29) that records the re- 
sponses produced by subjects when prompted 
with a cue word (e.g., “dog” may elicit “cat,” 
“bone,” or “cuddly”). To obtain a measure of 
associativity, we transformed cue-response 
counts using the best method identified in the 
literature so far (11, 29).

We took taxonomic similarity as a proxy for the categorical relatedness of meanings (e.g., “dog” and “cat” are taxonomically closer than either is to “key” or “love”). Following previous work on child overextension (15), we used a measure based on a large lexical database (33). The measure yields a score for the similarity of two meanings based on their closest common ancestor in a taxonomy.

We operationalized affectiveness as the sim- 
ilarity of affective experiential features such as 
emotional valence. More precisely, following 
(30), we quantified affective similarity between 
meanings as the cosine similarity of their 
vectors of ratings, built from two large-scale 
databases of affectiveness norms (34, 35). 
These norms encompass ratings for valence, 
arousal, and dominance. 

We analyzed three independent datasets that represent three phenomena of lexical creativity. The first includes 254 cases of overextension reported in English-speaking children (15), the most comprehensive collection available. We focused on English because overextension data from other languages is sparse and not suitable for a scalable analysis. The second dataset draws on the Database of Cross-Linguistic Colexifications (CLICS) (36), the largest resource for colexification. We worked with 22,379 attested colexification cases from 1486 languages. Accompanying CLICS on the longer evolutionary timescale, we also analyzed a third dataset, DatSemShift (37). This is the largest resource of historical semantic change, covering 1792 attested cases of semantic change from 516 languages.

[Fig. 2 plot. Panel titles: (A) Child overextension; (C) Semantic change. Vertical axis: Estimate; horizontal categories: Affectiveness, Associativity, Taxonomy.]

Fig. 2. Summary of main results. (A to C) Standardized estimates of the effect of knowledge types from best models of child overextension, colexification, and semantic change, respectively. (D) Accuracy of models when predicting new data (i.e., the fraction of correct predictions they make). Self-predictions (e.g., the colexification model's performance on colexification data)

For modeling purposes, we balanced the 
data to include an equal number of positive 
and negative cases. Positive cases exhausted 
the attested pairs in each dataset after pre- 
processing. We randomly sampled negative 
cases from pairings of attested meanings that 
result in unattested combinations. Following 
this procedure, the task of the models is to use 
one or multiple semantic knowledge types 
(visual similarity, associativity, taxonomic 
similarity, or affectiveness) to characterize each 
phenomenon by contrast to a backdrop of 
negative cases (10, 11, 15).
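The balancing step can be sketched as follows. This is a minimal illustration, not the preprocessing code of the study: the function name, the fixed random seed, and the toy meaning pairs are hypothetical, and pairs are treated as unordered.

```python
import random
from itertools import combinations


def sample_negatives(positive_pairs, seed=0):
    """Draw as many unattested pairs of attested meanings as there are positives."""
    positives = {frozenset(pair) for pair in positive_pairs}
    meanings = sorted({m for pair in positives for m in pair})
    # Candidate negatives: pairings of attested meanings that are not themselves attested.
    candidates = [
        frozenset(combo)
        for combo in combinations(meanings, 2)
        if frozenset(combo) not in positives
    ]
    if len(candidates) < len(positives):
        raise ValueError("not enough unattested pairs to balance the data")
    rng = random.Random(seed)
    return rng.sample(candidates, len(positives))


# Toy positives (hypothetical selection of attested pairs).
positive_pairs = [("finger", "toe"), ("moon", "sun"), ("dog", "cat")]
negative_pairs = sample_negatives(positive_pairs)
print(negative_pairs)
```

Because negatives are built only from meanings that occur in some attested pair, a model cannot separate the classes merely by recognizing which meanings are in the dataset.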

To test our proposal, we first identified the model that best characterizes each of the three phenomena in isolation and then tested each of these models on data from the other phenomena. For model selection, we fit several logistic regression models predicting whether a pair of meanings colexifies in a language, participates in semantic change in a language, or appears in overextension in English. For each phenomenon, the only parameters that vary across their models are the knowledge types that they have as predictors. We considered all possible combinations: from four univariate models per phenomenon (one for each knowledge type) to models with two, three, or all four knowledge types. Colexification and semantic change models have language and geographical region as population-level effects. For model comparison and validation, we used approximate leave-one-out cross-validation. Our measure for model selection is expected log predictive density (38).
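The space of candidate models can be enumerated explicitly. The sketch below lists every nonempty combination of the four knowledge types and evaluates a plain logistic prediction for one pair of meanings; the predictor names, coefficients, and feature values are hypothetical, and the population-level effects and the cross-validation step described above are omitted.

```python
import math
from itertools import combinations

KNOWLEDGE_TYPES = ("visual", "associativity", "taxonomy", "affectiveness")


def predictor_sets(types=KNOWLEDGE_TYPES):
    """All nonempty combinations of knowledge types, from univariate to full."""
    return [
        combo
        for size in range(1, len(types) + 1)
        for combo in combinations(types, size)
    ]


def predicted_probability(intercept, coefficients, features):
    """Logistic model: probability that a pair of meanings is attested."""
    score = intercept + sum(coefficients[name] * features[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-score))


candidate_models = predictor_sets()
print(len(candidate_models))

# Hypothetical coefficients and standardized features for one meaning pair.
coefficients = {"associativity": 1.5, "taxonomy": 0.5}
features = {"visual": 0.0, "associativity": 1.0, "taxonomy": 1.0, "affectiveness": 0.0}
probability = predicted_probability(0.0, coefficients, features)
print(probability)
```

Four knowledge types yield 4 + 6 + 4 + 1 = 15 candidate models per phenomenon, which is small enough to compare exhaustively.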


Results 
Types of semantic knowledge 


provide an upper bound for cross-predictions. The random baseline of 0.5 (dashed line) provides a lower bound. Ceiling and bottom predictive accuracy are 1.0 and 0, respectively. The best models for evolutionary data [(B) to (D)] only use 3 predictors; we include a bar for affectiveness at 0 for

Figure 2, A to C, shows the standardized estimates from the best model for each phenomenon. Details and model rankings are provided in the supplementary materials. These results generalize previous findings that analyze each phenomenon separately (3, 10, 11, 15) in several ways. First, the results show that across phenomena, a word is more likely to be creatively extended when its meaning shares properties with that of a new target referent. All coefficients are positive, meaning that the higher the semantic relatedness between two meanings, the higher the likelihood that they will be connected through lexical creativity. When meanings are similar along several dimensions, the likelihood that lexical creativity connects them grows. Second, the results suggest that the semantic properties that anchor lexical creativity are of diverse types. All of the best models draw on multiple knowledge types: for developmental data (Fig. 2A), all four of them, and for evolutionary data (Fig. 2, B and C), all modalities but affectiveness. A third finding is the similarity in the ranking of the coefficients, with associativity being the highest.

There are also differences between the models (Fig. 2, A to C). The most salient one is that associativity is more predictive of colexification and semantic change than of overextension. Taxonomy and visual similarity are more predictive of overextension than of the other two phenomena, and overextension factors in affectiveness, unlike the other two. These differences are partly reflected in the literature on child overextension, in which a prominently documented type of overextensional error is violation of taxonomic constraints, e.g., the use of words to describe referents from higher-order taxonomic categories (6). However, these differences might also be attributed to the resources we use as proxies for knowledge types. These resources are based on adult language use and may thus account less well for children’s data [e.g., in word association, adult speakers are more likely to associate concepts based on situation as opposed to taxonomy (39)].


Cross-prediction 


We next show that lexical creativity builds on 
a common foundation more directly, by per- 
forming a cross-predictive analysis. Specifically, 
we evaluated models for one phenomenon 
(e.g., overextension) on how well they account 
for data from a different phenomenon (e.g., 
colexification). We termed this “cross-prediction,” 
and contrast it with “self-prediction,” that is, 
prediction for unseen data from the same phe- 
nomenon (e.g., the overextension model being 
applied to unseen overextension data). To rule 
out any carryover owing to pairs that appear 
in multiple datasets [e.g., “moon” and “sun,” 
a meaning pair that colexifies in some lan- 
guages and children have linked through over- 
extension (the intersection is illustrated in 
Fig. 1A)], we excluded all pairs that appear in 
more than one dataset. This makes the task 
harder but ensures that the models’ perform- 
ance reflects their capabilities to characterize 
truly out-of-sample data. Details can be found 
in the supplementary materials. 

Figure 2D reports self- and cross-predictive 
accuracies of the models. Cross-prediction is 
very successful, even when compared with self- 
prediction. There is good carryover not only 
among the longer timescale phenomena but 
also between developmental and evolutionary 
phenomena. In all cases, the difference in ac- 
curacy between self- and cross-prediction is very 
small (between 0 and 0.03), and the difference 
to the baseline is large (0.22 to 0.31). 
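The evaluation logic can be sketched as follows. The snippet first removes pairs that occur in more than one dataset and then compares the accuracy of self- and cross-predictions on balanced labels; the function names, toy pairs, and toy predictions are hypothetical and only illustrate the bookkeeping.

```python
from collections import Counter


def drop_shared_pairs(datasets):
    """Remove every meaning pair that appears in more than one dataset."""
    counts = Counter(pair for pairs in datasets.values() for pair in set(pairs))
    return {
        name: [pair for pair in pairs if counts[pair] == 1]
        for name, pairs in datasets.items()
    }


def accuracy(predictions, labels):
    """Fraction of correct predictions."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


datasets = {
    "overextension": [frozenset({"moon", "sun"}), frozenset({"apple", "ball"})],
    "colexification": [frozenset({"moon", "sun"}), frozenset({"finger", "toe"})],
    "semantic_change": [frozenset({"mouse", "computer device"})],
}
cleaned = drop_shared_pairs(datasets)

# Toy labels (1 = attested, 0 = unattested) and toy model outputs.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
self_predictions = [1, 1, 1, 0, 0, 0, 0, 1]
cross_predictions = [1, 1, 1, 0, 0, 0, 1, 1]

self_accuracy = accuracy(self_predictions, labels)
cross_accuracy = accuracy(cross_predictions, labels)
print(self_accuracy, cross_accuracy, cross_accuracy - 0.5)
```

On balanced data the random baseline is 0.5, so the quantities of interest are the gap between self- and cross-prediction and the margin of each over that baseline.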

Because the models differ in their coeffi- 
cients (Fig. 2, A to C), it could be that the self- 
and cross-predictions yield similar accuracies 
but make quite different predictions. This is 
not the case (Fig. 3): Self- and cross-predictions 
are well aligned throughout. 


Robustness checks 


Our study builds on English resources to derive 
proxies for semantic knowledge owing to a lack 
of comparable large-scale resources in other 
languages. This may not fully capture variation 
that is culture or language specific (77). Moreover, 
building on English resources could introduce 
English-specific biases; for example, a bias could 
be introduced because the overextension data 
are from English-speaking children. To ensure 
that such biases do not drive our findings, we 
performed a series of robustness checks (sup- 
plementary materials). 

First, we reevaluated the models’ cross-predictive abilities on data that exclude Indo-European languages, the language family to which English belongs. This exclusion only concerns colexification and semantic change data because the overextension data are based on English speakers. If cross-prediction results were driven by an English bias, we would expect the models’ predictive capabilities to decrease when tested on non-Indo-European data only. In this regard, our results are robust (Fig. 4A). 

Fig. 3. Comparisons of model self- and cross-predictions based on data from child overextension, colexification, and semantic change. (A to C) Comparisons of self-predictions made by the best model of a phenomenon (y axis) against cross-predictions made by the best model for another phenomenon (x axis) for data from child overextension (A), colexification (B), and semantic change (C). Data points are attested cases of each phenomenon (a counterpart for unattested cases can be seen in fig. S3). Colors and shapes separate predictions into classes: “Right/Right” indicates correct predictions by both the self- and cross-predicting models, “Wrong/Wrong” indicates incorrect predictions by both, “Right/Wrong” indicates a correct prediction from the self-predicting model but an incorrect one from the cross-predicting one, and conversely for “Wrong/Right.” To make plots legible, colexification data were randomly subsampled to 8%. 
Fig. 4. Robustness checks on cross-prediction against geographic or phylogenetic biases. Red dashed lines indicate accuracy of best models on original data (Fig. 2D). (A) Accuracy of best models when evaluated on data that excludes Indo-European languages. (B and C) Accuracy of colexification (B) and semantic change (C) models when refit either excluding data from one of their five largest language families (left) or without data from one of their six macro regions (right). 

Next, we performed checks by manipulating the data on which the models are fit. We refit the best colexification and semantic change models, leaving out one at a time each of the five major language families found within the data. The same leave-one-out refitting process was conducted for all large geographical regions. The results are stable (Fig. 4, B and C). 
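
The leave-one-out refitting just described follows a simple loop, sketched here under stated assumptions. The function name `leave_one_group_out` and its parameters are hypothetical; `fit` and `evaluate` stand in for whatever model-fitting and cross-prediction scoring routines are actually used, which are specified in the supplementary materials rather than here.

```python
from typing import Callable, Dict, List, TypeVar

Record = TypeVar("Record")
Model = TypeVar("Model")


def leave_one_group_out(
    records: List[Record],
    group_of: Callable[[Record], str],
    fit: Callable[[List[Record]], Model],
    evaluate: Callable[[Model], float],
) -> Dict[str, float]:
    """Refit once per group with that group removed, then score each refit.

    `group_of` maps a record to its language family or macro region.
    The returned dict maps the held-out group to the refit model's score.
    """
    groups = sorted({group_of(record) for record in records})
    scores: Dict[str, float] = {}
    for held_out in groups:
        # Training data are all records that do not belong to the held-out group.
        training = [r for r in records if group_of(r) != held_out]
        scores[held_out] = evaluate(fit(training))
    return scores
```

If the scores stay close to the accuracy obtained with all groups included, no single family or region is driving the result, which is the sense in which the findings are called stable.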

Last, we also redid our analyses using alter- 
native visual representations obtained from a 
model trained on a nonlinguistic task. When 
we used these representations as well, our 
results were stable (tables S15 to S20 and 
fig. S1). 


Discussion 


Our findings suggest a shared human capacity 
to creatively extend words to new meanings 
across timescales and at both the individual and 
population levels. We argue that this capacity relies 
on a common foundation of knowledge, with 
different facets of semantic relatedness that 
enable new meaning extensions. 

Although our results indicate that diverse 
manifestations of lexical creativity are related 
and share common ground, the current study 
cannot speak directly to the nature of this re- 
lationship. Our findings are compatible with 
at least two different explanations. The first ex- 
planation is a direct causal pathway, with child 
overextensions being adopted directly by linguistic 
communities, hence explaining their 
resemblance to products of language evolution. 
However, we believe that this account is 
implausible for several reasons. First, it is 
unlikely for children’s spontaneous innova- 
tions to be regularly adopted by broader adult 
populations or in language change (23, 40), 
and more so to a degree that leaves a cross- 
linguistic signature. Second, this account would 
leave the attested nonintersecting cases of co- 
lexification and semantic change in Fig. 1A 
unexplained. Some of these meanings are 
encountered relatively late in language acqui- 
sition, making them less likely to appear in 
overextension. Third, functional pressures toward 
efficient communication shape word meanings 
across languages (11, 41-44). These pressures 
are independent of child overextension and 
suggest that phenomena such as colexification 
are partially shaped by a need to distinguish 
meanings that appear in similar contexts. This 
may explain why child overextensions such as 
“baby” for “adult” or “bus” for “train” are rarely 
expressed by a single word across languages: 
Doing so may cause ambiguity that is hard to 
resolve even in context (11, 44, 45). A second 
explanation, which we suggest to be more 
likely, is an indirect relationship, in which prod- 
ucts of lexical creativity stem from a common 
latent source of multifaceted semantic knowl- 
edge (Fig. 1B). That is, children draw from this 
source for overextension, and adults do so as 
well when extending meaning in new ways. 
Some instances of creative lexical uses by adults 
(e.g., the metaphorical extension of “mouse” to 
the computer device) are then adopted by their 
linguistic communities over time, making their 
way into the lexicon. 

We have shown that the products of lexical 
creativity of young learners and language 
users in general can both be explained by a 
single latent common ground. Our work iden- 
tifies a foundation of shared knowledge for 
this common ground, extending prior research 
that suggests that words tend to express related 
meanings (10, 17) owing to cognitive advantages 
for learning, retrieving, and interpreting words 
(3, 46-48). More generally, the use 
of the same word for multiple meanings allows 
for more compressible lexicons (49) and for 
the reuse of shorter words that are easier to 
produce (45, 50). Future work should further 
specify the origins of this common founda- 
tion and the cognitive mechanisms of human 
lexical creativity. 
In vivo hematopoietic stem cell modification by mRNA delivery 
Hematopoietic stem cells (HSCs) are the source of all blood cells over an individual's lifetime. Diseased HSCs can be replaced with gene-engineered or healthy HSCs through HSC transplantation (HSCT). However, current protocols carry major side effects and have limited access. We developed CD117/LNP-messenger RNA (mRNA), a lipid nanoparticle (LNP) that encapsulates mRNA and is targeted to the stem cell factor receptor (CD117) on HSCs. Delivery of the anti-human CD117/LNP-based editing system yielded near-complete correction of hematopoietic sickle cells. Furthermore, in vivo delivery of pro-apoptotic PUMA (p53 up-regulated modulator of apoptosis) mRNA with CD117/LNP affected HSC function and permitted nongenotoxic conditioning for HSCT. The ability to target HSCs in vivo offers a nongenotoxic conditioning regimen for HSCT, and this platform could be the basis of in vivo genome editing to cure genetic disorders, which would abrogate the need for HSCT. 


Hematopoietic stem cells (HSCs) reside in 
the bone marrow (BM), where they divide 
throughout life to produce all cells of the 
blood and immune system through their 
self-renewal ability. Their multipotency 
enables the formation of myeloid (erythroid, 
megakaryocytic, and myeloid-immune) and 
lymphoid cell progenitors. HSC transplanta- 
tion (HSCT), which replaces diseased HSCs with 
healthy ones, can be a curative treatment for 
nonmalignant hematopoietic disorders, such 
as hemoglobinopathies and immunodeficiencies. 
Nonmalignant hematopoietic disorders can be 
cured with allogeneic HSCT (in which the HSC 
source is obtained from a sibling, parent, or 
unrelated donor), but only a fraction of patients 
have a suitable immunologic match to minimize 
the potentially fatal complication of graft-versus-host 
disease (GVHD). Gene therapy can 
eliminate the risk of GVHD and correct nonmalignant 
hematopoietic disorders by using 
autologous HSCs (in which the HSCs are obtained 
from the patient) and repairing 
the genetic defect through either gene addition 
or editing. Current hematopoietic gene therapy 
requires isolation of HSCs from the patient and 
ex vivo lentiviral transduction for gene addition 
or electroporation with purified reagents for 
genome editing. A “conditioning” regimen, 
such as chemotherapy or radiation, is used to 
eliminate the patient’s own HSCs. This makes 
space in the BM niche to allow the engraftment 
of infused allogeneic donor or genetically modi- 
fied autologous HSCs. The conditioning proce- 
dure carries substantial acute and chronic 
systemic toxicities, including infertility and 
secondary malignancies due to accumulated 
DNA damage. Additionally, some nonmalignant 
hematopoietic disorders are due to DNA repair 
pathway mutations, such as radiosensitive severe 
combined immunodeficiency (SCID) or Fanconi 
anemia. These patients do not tolerate existing 
conditioning because of excessive toxicity with 
alkylating chemotherapy or radiation, as well 
as increased rates of malignancy long term. 
Therefore, we sought to address two major 
challenges by developing a flexible methodol- 
ogy that can modify HSCs in vivo and sepa- 
rately establish a nongenotoxic conditioning 
method. Here, we describe an HSC-targeted 
lipid nanoparticle (LNP) that encapsulates 
mRNA and uses antibodies against CD117 
conjugated to LNP (CD117/LNP-mRNA). HSCs 
are dependent on stromal-derived factors, in- 
cluding stem cell factor (SCF), which binds to 
the receptor c-Kit (CD117). CD117 is expressed 
on both short- and long-term HSCs and some 


hypothesize may facilitate or augment LNP 
internalization (2). Nucleoside-modified and 
purified mRNA is nonimmunogenic, stable, 
and extensible and can be used to express vir- 
tually any protein of interest (3-5). LNPs are 
thus far the most promising delivery system to 
fulfill the therapeutic potential of mRNA (6, 7). 
These LNPs contain ionizable lipids (positively 
charged at pH < 6.4), which aid in packaging 
the mRNA and endosomal escape. Such LNPs 
were first approved in 2018 for small inter- 
fering RNA (8) but became widely used in 
2020 because of the LNP-mRNA platform for 
the Moderna and Pfizer COVID-19 vaccines. 
The LNP-mRNA in these US Food and Drug 
Administration-approved vaccines drives anti- 
gen expression but does not actively target 
specific cells or organs. By decorating the 
surface of LNPs with targeting moieties, we 
have demonstrated effective targeting to spe- 
cific cell types, such as endothelial cells and 
T cells, with therapeutic efficacy upon single 
intravenous injection in mice, as described in 
our previous reports (9-11). In this work, we 
used nucleoside-modified mRNA that encoded 
Cre recombinase, 
a CRISPR-Cas9 adenine base editor fusion 
gene, or the pro-apoptotic BH3-only gene PUMA 
(p53 up-regulated modulator of apoptosis) in 
CD117/LNP-mRNA to genetically alter HSCs, cor- 
rect a disease mutation, or deplete HSCs through 
nongenotoxic conditioning, respectively. Prior 
studies have shown that HSC depletion through 
immunotoxins or radioimmunotherapy can be 
performed as conditioning for HSCT (12, 13), 
but they can only serve as platforms for HSC 
depletion rather than delivering other cargos. 
Our proof-of-principle data reveal an innova- 
tive and flexible approach to target HSCs in 
vivo, which may pave the way to modify HSC 
behavior and correct genetic mutations by 
delivering targeted mRNA-based therapeutics 
capable of genome engineering. 


Results 

Anti-CD117 LNPs efficiently target BM cells in vitro 

We first incubated C57BL/6 lineage-depleted (Lin−) BM cells or whole bone marrow (WBM) in vitro with either unconjugated LNPs that encapsulated 0.1, 1, or 3 µg of nucleoside-modified luciferase mRNA (unmodified LNP-Luc), anti-CD45-conjugated LNPs (CD45/LNP-Luc), anti-CD117-conjugated LNPs (CD117/LNP-Luc), or isotype control immunoglobulin G (IgG)-conjugated LNPs (control IgG/LNP-Luc). CD45/LNP and CD117/LNP were hypothesized to bind all hematopoietic-derived cells or stem and progenitor cells, respectively. Control IgG/LNP and unconjugated LNP were used as controls. The highest levels of luciferase activity in WBM were detected with CD117/LNP-Luc (Fig. 1A). Luciferase activity was further increased when Lin− cells were treated with CD117/LNP-Luc (Fig. 1B). Increased activity of CD117/LNP-Luc in Lin− cells was consistent with a 23-fold increase in the proportion of CD117+ cells in Lin−-selected cells (2.8% CD117+ in WBM cells versus 65% CD117+ in Lin− cells). CD117/LNP luciferase activity was 500- to 700-fold greater than CD45/LNP luciferase activity in WBM depending on dose, when normalized to the frequency of CD45- and CD117-positive cells in WBM (fig. S1A). Normalized luciferase activity suggests that CD117-mediated targeting and delivery is superior to CD45-mediated targeting in vitro. This demonstrates efficient targeting and functional delivery of mRNA with CD117/LNP. 

CD117/LNP that encapsulated Cre recombinase mRNA (CD117/LNP-Cre) was used to test LNP-mediated genetic recombination in HSCs and persistence of the editing in conjunction with three reporter mouse models. These mouse models (Ai6, Ai9, and Ai14) are engineered with a Cre-responsive reporter allele comprising a loxP-flanked STOP cassette preventing transcription of a CAG promoter-driven green or red fluorescent reporter gene (ZsGreen for Ai6 and tdTomato for Ai9 and Ai14, respectively) inserted into the Gt(ROSA)26Sor locus (14). The fraction of edited WBM cells (Fig. 1C) and the subset of edited Lin−Sca1+c-Kit+ (LSK) cells within the BM (Fig. 1D) exhibited a dose dependency (0.1 to 1 µg mRNA) when incubated with CD45/LNP-Cre and control IgG/LNP-Cre. The majority of LNP-mediated transfections occurred within 6 hours (Fig. 1, C to F). Targeting rates in the LSK cell subset were consistently and significantly greater with CD117/LNP-Cre than with CD45/LNP-Cre or control IgG/LNP-Cre, which suggests saturation of c-Kit+ cells by CD117/LNP-Cre at the lowest dose tested (Fig. 1, C and D). CD117/LNP-Cre showed greater efficacy in LSK cells at lower concentrations: Treatment with 0.1 µg CD117/LNP-Cre was 2.5-fold more effective at targeting LSK cells compared with treatment with 0.1 µg CD45/LNP-Cre (Fig. 1D). There was no significant difference between targeted cell frequency in the LSK cell subset with the 0.1-µg and 0.5-µg dose or 0.5-µg and 1-µg dose. We also replaced the media of cells treated for 18 hours with LNP-Cre and kept them for an additional 3 days in culture to assess the maximum targeting achieved after exposing WBM to LNPs. The rate of targeted cells increased over 3 days without additional LNP exposure (Fig. 1G): At a dose of 0.1 µg CD117/LNP-Cre, 88.5% WBM cells were ZsGreen+ at 90 hours versus 43.5% at 18 hours (Fig. 1, E and G), which indicates that additional mRNA translation, Cre-mediated recombination, and ZsGreen transcription and translation occurred beyond the 18-hour LNP exposure. Notably, LNP-Cre treatment had no consistent effect on cell viability across formulations, regardless of the targeting antibody (fig. S1, B to D). Hence, we determined that the use of CD117/LNP-Cre was superior to that of CD45/LNP-Cre to modify HSCs and selected CD117/LNP-Cre for subsequent experiments. 

Fig. 1. In vitro targeting of WBM or hematopoietic progenitor (Lin−) cells incubated with LNPs that encapsulate luciferase (CD117/LNP-Luc) or Cre recombinase (CD117/LNP-Cre) mRNAs. (A) Luciferase activity normalized by total protein in WBM cells incubated with varying doses (indicated on x axis) of targeted or control LNP-Luc for 18 hours in vitro. Data indicate mean ± SD of n = 3 replicate experiments. P values are from Dunnett's multiple comparison test after two-way analysis of variance (ANOVA). ****P < 0.0001. (B) LNP-Luc treatment of Lin− BM (n = 3). Data indicate mean ± SD of n = 3 replicate experiments. P values are from Dunnett's multiple comparison test after two-way ANOVA. ****P < 0.0001. (C to G) Assessment of ZsGreen+ reporter induction after LNP-Cre treatments in Ai6 BM cells triggered by removal of loxP-flanked STOP cassette by Cre. Treatment of [(C), (E), and (G)] BM cells or [(D) and (F)] Lin− BM cells at doses and culture intervals stated in figure. [(D) and (F)] LSK cell subset shown when treating Lin− cells. No difference between CD117/LNP-Cre editing in Lin− cells treated with 0.1 and 0.5 or 0.5 and 1 µg. In (C) to (G), data represent mean ± SD of n = 3 replicate experiments. P values are from Dunnett's multiple comparison test after two-way ANOVA. Specifically, in (C) to (G), **P < 0.01, ***P < 0.001, ****P < 0.0001. 
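
The fold changes quoted in this section are simple ratios of the reported percentages. The short check below reproduces two of them; the helper name `fold_change` is hypothetical, and only numbers stated in the text are used.

```python
def fold_change(after: float, before: float) -> float:
    """Ratio of two positive quantities, e.g. percentages of positive cells."""
    if before <= 0:
        raise ValueError("the reference value must be positive")
    return after / before


# Proportion of CD117+ cells: 65% in Lin- cells versus 2.8% in WBM cells.
cd117_enrichment = fold_change(65.0, 2.8)   # about 23.2, reported as 23-fold

# ZsGreen+ WBM cells at 0.1 ug CD117/LNP-Cre: 88.5% at 90 hours versus 43.5% at 18 hours.
late_increase = fold_change(88.5, 43.5)     # about 2.0
```

The second ratio shows that roughly half of the final reporter-positive fraction appeared after the 18-hour LNP exposure had ended.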


Anti-CD117 LNPs edit multipotent and 
self-renewing long-term HSCs ex vivo 


To evaluate multipotency in cells edited by use of CD117/LNP-Cre, we transplanted lethally irradiated congenic C57BL/6 CD45.1-recipient mice with Ai14 BM cells treated ex vivo with increasing doses of CD117/LNP-Cre and control IgG/LNP-Cre. Because HSCs give rise to all blood cell lineages, we followed reporter gene expression in peripheral blood cells over time and analyzed the BM at the 4-month endpoint (Fig. 2). The percentage of CD117/LNP-Cre-mediated tdTomato-positive Ai14 erythroid cells in recipient mice increased with time after HSCT, which is consistent with the engraftment of donor HSCs (Fig. 2C). Mice had durable editing in all lineages, specifically myeloid cells (Gr1+, Fig. 2A), lymphoid cells (CD3+ and B220+, Fig. 2B), and erythroid cells (Fig. 2C) at 4 months after HSCT, which is consistent with genome editing of multipotent HSCs. 

Fig. 2. CD117/LNP-Cre treatment ex vivo leads to near-complete tdTomato gene editing upon transplantation. (A and B) Percentage tdTomato marking in myeloid (Gr1+) (A) and lymphoid (CD3+, left, and B220+, right) (B) cells measured at 16 weeks after HSCT in lethally irradiated congenic CD45.1 recipients who received Ai14 BM treated ex vivo with 0.1 and 1 µg of control IgG/LNP-Cre or CD117/LNP-Cre. In (A) and (B), data represent mean ± SEM of n = 4 (for IgG/LNP-Cre at 1 µg only) or n = 5 experimental animals per cohort. P values are from Tukey's multiple comparison test after one-way ANOVA. In (A), ****P < 0.0001. In (B), *P < 0.05, ****P < 0.0001. (C) Kinetic analysis of erythroid editing measured up to 16 weeks after HSCT. Data represent mean ± SD of n = 4 or 5 experimental animals per cohort. (D) tdTomato marking in the BM and BM subsets: c-Kit+ (Lin−c-Kit+), LSK, and LT-HSCs (SLAM). Data represent mean ± SEM of n = 4 or 5 experimental animals per cohort [same animals as in (A) and (B)]. P values are from Tukey's multiple comparison test after one-way ANOVA. ***P < 0.001, ****P < 0.0001. (E) CFU assay from Ai14 BM treated ex vivo with 0.1 µg or 1 µg of control IgG/LNP-Cre or CD117/LNP-Cre formulations or untreated. (F and G) Semiquantitative PCR of (F) BM and (G) spleen genomic DNA isolated from the groups in (A) to (C) at 4 months after BMT. **271 base pair (bp) Cre-recombinase-edited genomic DNA (gDNA) region and *1142 bp unedited region are indicated. 
Fig. 3. CD117/LNP-Cre formulations lead to >50% tdTomato marking in LT-HSCs after in vivo injection. (A) Biodistribution of intravenous injection of 1 µg of targeted LNP-mRNA expression in vivo by means of luminescence imaging at 24 hours. A representative sample set of dissected mouse organs was analyzed 5 min after the administration of D-luciferin. (B to D) tdTomato+ cell frequency in peripheral blood (B) myeloid (Gr1+) and (C) lymphoid cells [CD3+ (T cells), B220+ (B cells)] and in (D) BM subsets (c-Kit, LSK, SLAM/LT-HSCs) at 4 months after in vivo treatment with 5 µg of CD117/LNP-Cre or control IgG/LNP-Cre. In (B), (C), and (D), data represent mean ± SEM of n = 5 experimental animals per cohort. P values are reported from paired t test. **P < 0.01, ***P < 0.001, ****P < 0.0001. (E to G) tdTomato+ cell frequency in peripheral blood (E) myeloid, (F) lymphoid cells, and (G) BM subsets at 4 months after in vivo treatment with 5 or 1 µg of CD117/LNP-Cre. In (E), (F), and (G), data represent mean ± SEM of n = 7 (1 µg) and n = 5 (5 µg) experimental animals per cohort. P values are reported from t test. ***P < 0.001, ****P < 0.0001. (H and I) Edited RBC frequency over time in Ai9 mice treated in vivo with (H) 5 µg of CD117/LNP-Cre or control IgG/LNP-Cre or with (I) 1 or 5 µg of CD117/LNP-Cre. In (H), data represent mean ± SD of n = 5 experimental animals per cohort. P values are reported from paired t test. ****P < 0.0001. In (I), data represent mean ± SD of n = 7 (1 µg) and 5 (5 µg) experimental animals per cohort. P values are reported from t test. ****P < 0.0001. (J) CFU assay from BM at 4 months after in vivo treatment with 5 µg control IgG/LNP-Cre (top), no treatment (middle), or 5 µg CD117/LNP-Cre (bottom). (K and L) Semiquantitative PCR of (K) BM and (L) spleen genomic DNA isolated from the animals in (B) to (D) at 4 months after BMT. **271 bp Cre-recombinase-edited gDNA region and *1142 bp unedited region are indicated. 
control IgG/LNP-Cre, respectively (Fig. 2D), which was similar to that seen in the WBM, the c-Kit+, and the LSK cell subsets. Donor chimerism was consistently high among all groups (>94% at 4 months) (fig. S2A). The gene-editing rates of ex vivo-treated BM cells were dose dependent (fig. S2, B and C). Red blood cell (RBC)- and leukocyte-editing rates with CD117/LNP-Cre were ≥99% at 0.05-, 0.1-, and 1-µg mRNA doses and 91.8% at the 0.01-µg dose (Fig. 2, A to C, and fig. S2, B to D). By comparison, targeting mediated by control IgG/LNP-Cre was near 0% at 0.01 µg (fig. S2, C and D). tdTomato+ Gr1+ cells had the fastest rise (fig. S2, B and C), which is expected given their rapid turnover of 2 to 3 days. BM cells harvested from these animals showed similar editing rates in colony-forming assays, a functional assay for clonogenic potential, and thus corroborated the flow cytometry results of LT-HSCs (Fig. 2E and fig. S2, E and F). At 4 months after HSCT, splenocytes had genome-editing levels comparable with those in the WBM (Fig. 2, F and G), which is consistent with the migration of edited BM-derived cells to the spleen. To assess the stem cell potential of ex vivo-edited BM cells, we performed secondary transplants using the BM from two primary chimeras that were recipients of Ai14 BM cells treated ex vivo with either CD117/LNP-Cre or control IgG/LNP-Cre (0.1-µg dose of mRNA). Editing levels in secondary chimeras phenocopied those observed in the primary transplantation, which included sustained editing in the LT-HSC subset and editing in multiple hematopoietic lineages (fig. S3). 

Fig. 4. Base editing of the E6V sickle cell mutation with human CD117 targeted LNP. (A) Representative reverse-phase (RP) high-performance liquid chromatography (HPLC) chromatograms of in vitro differentiated sickle cell disease (SCD) erythroid progenitor lysates untreated (left) and after treatment with anti-human CD117 (hCD117)/LNP-NRCH Cas9 ABE-8e mRNA and hCD117/LNP gRNA (middle and right). Base editing yields nonpathogenic HBB^G (β^G), which elutes before pathogenic HBB^S (β^S) and the α-globin protein (α). Percent shown is β^G/(β^G + β^S) × 100. (B) Representative images of sickling of in vitro-differentiated erythroid progenitors under hypoxic conditions at the treatments in (A). Arrowheads indicate sickled morphology. Scale bar, 20 µm. (C) Percentage of sickled cells from unedited and edited (varying mRNA doses) sickling assays. Data represent mean ± SD of n = 10 high-powered fields (hpf) (unedited specimens) and n = 30 hpf (edited specimens). P values are reported from unpaired t test. ****P < 0.0001. (D) Correlation of %β^G by RP-HPLC (protein) to base edited allele frequency (DNA). 
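
The percentage attached to the HPLC traces in Fig. 4A is the edited β-globin peak as a share of the edited plus sickle peaks, that is, β^G/(β^G + β^S) × 100. A minimal sketch of that calculation follows; the function name `percent_beta_g` and the example peak areas are hypothetical and are not taken from the figure.

```python
def percent_beta_g(area_beta_g: float, area_beta_s: float) -> float:
    """Percent beta-G among beta-G plus beta-S, from integrated peak areas."""
    total = area_beta_g + area_beta_s
    if area_beta_g < 0 or area_beta_s < 0 or total == 0:
        raise ValueError("peak areas must be non-negative and not both zero")
    return 100.0 * area_beta_g / total


# Hypothetical peak areas in arbitrary units, chosen only to illustrate the ratio.
example = percent_beta_g(area_beta_g=88.0, area_beta_s=12.0)
```

Because the α-globin peak is left out of the denominator, the value reports how much of the β-like globin has been converted, independent of total hemoglobin loaded.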


In vivo editing of multipotent and self-renewing 
long-term HSCs 


Given the near-complete targeting of LT-HSCs ex vivo with CD117/LNP-Cre and our prior ability to target lung endothelial and T cells in vivo (9-11, 15), we hypothesized that LT-HSCs could be targeted in vivo as well. Intravenous administration of CD117/LNP-Luc generated luciferase activity in the femur at 24 hours, whereas IgG/LNP-Luc did not (Fig. 3A). Both control IgG/LNP-Luc and CD117/LNP-Luc showed comparable luciferase activity in the liver because LNPs bind apolipoprotein E (ApoE) and are nonspecifically targeted to the low-density lipoprotein (LDL) receptor, which is expressed on hepatocytes (8). We tested in vivo multilineage editing by quantifying tdTomato expression in peripheral blood cells of animals treated intravenously with control IgG/LNP-Cre or CD117/LNP-Cre over time (up to 4 months) and tdTomato expression in BM, and specifically the LT-HSCs, at 4 months. At the same dose (5 µg), CD117/LNP-Cre-treated mice had significantly higher editing in all peripheral blood lineages (Fig. 3, B and C) and threefold more editing in LT-HSCs (55% versus 19%, respectively) compared with that observed in control IgG/LNP-Cre-treated mice (Fig. 3D). HSC editing after in vivo treatment with CD117/LNP-Cre was dose dependent in peripheral blood and BM at 16 weeks, with a 5.5-fold increase in the percentage of gene-edited LT-HSCs with 5 versus 1 µg (Fig. 3, E to G). LNP-Cre in vivo editing led to the appearance of edited RBCs with kinetics similar to that of the transplantation of ex vivo-treated BM (Fig. 3, H and I). At 4 months after treatment with CD117/LNP-Cre, marking of HSCs was confirmed with visual inspection of tdTomato+ colony-forming units (CFUs) (Fig. 3J and fig. S7A), and Cre-mediated genomic deletion in the BM and splenic DNAs was confirmed with polymerase chain reaction (PCR) (Fig. 3, K and L). To further confirm in vivo LT-HSC targeting, we investigated editing of the endothelial protein C receptor (EPCR)+ LT-HSC SLAM subpopulation (16), whose self-renewal properties are enriched compared with that of the LT-HSC SLAM population (17), using the Ai6 model (fig. S4). Editing rates in the SLAM LT-HSC population and the EPCR+ LT-HSC subpopulation were comparable within each cohort (CD117/LNP-Cre and control IgG/LNP-Cre) (fig. S4E). Mice injected with CD117/LNP-Cre had 55% ± 10% edited SLAM LT-HSCs versus 46% ± 14% edited EPCR+ LT-HSCs, whereas mice in the control group had 9% ± 2.3% edited SLAM LT-HSCs versus 8% ± 1.9% edited EPCR+ SLAM LT-HSCs (fig. S4E). CFUs from the BM of primary chimeras generated from in vivo-treated donors confirmed the editing differences between the two cohorts and yielded no difference in the number of colonies (fig. S4, F to H). To demonstrate that LNP-mediated editing targeted bona fide HSCs, chimeras from the initial in vivo experiment (Ai9 strain) were generated by transplanting irradiated congenic (C57BL/6 CD45.1) recipients with BM from mice 4 months after in vivo treatment with a 5-µg dose of CD117/LNP-Cre or control IgG/LNP-Cre. Assessment of the hematopoietic-derived lineages, which included LT-HSCs in the BM, in these chimeras recapitulated editing found in the donor cells (fig. S5). LT-HSC editing in secondary chimeras was 52% for those derived from the CD117/LNP-Cre-treated primary and 19% for those derived from the control IgG/LNP-Cre-treated primary. The absolute count of viable LT-HSCs was comparable among cohorts in both primary ex vivo transplants and in mice injected in vivo (fig. S6). 
Nonhematopoietic targeting after targeted LNP treatment 

To quantify nonspecific cellular uptake, we compared tdTomato expression levels in lung and liver cells 4 months after in vivo treatment with a single dose of CD117/LNP-Cre (1- and 5-µg dose) or control IgG/LNP-Cre (5-µg dose). At 5 µg, liver editing was high (76% to 79% of cells), and editing was comparable between the two treatments (fig. S7B), which is consistent with known nonspecific ApoE and LDL receptor axis-mediated LNP uptake (8). In the lung, tdTomato expression mediated by CD117/LNP-Cre delivery was significantly higher (sevenfold) than that of mice injected with control IgG/LNP-Cre (fig. S7C). Editing observed in the perfused lung was threefold higher with 5 µg of CD117/LNP-Cre compared with 1 µg. This effect was partly “on-target” editing: Approximately 8% of lung cells were c-Kit+ and ~90% of lung c-Kit+ cells were edited (fig. S7D). Cells collected from the testis were also analyzed and did not show significant variations from baseline levels in control mice (fig. S7E). Additionally, none of 50 offspring sired by male mice treated with CD117/LNP-Cre in vivo (n = 4) or 39 offspring sired by male mice treated with control IgG/LNP-Cre (n = 3) in vivo expressed tdTomato. A complete list of animals evaluated is provided in table S3. 

Fig. 5. HSC depletion and transplantation conditioning with CD117/LNP-PUMA. (A to D) GFP+ granulocytes (A) and RBCs in peripheral blood (B), as well as percentage of GFP+ CD45+ splenocytes (C) and BM cells (D) of C57BL/6 CD45.1 chimeras competitively transplanted with indicated proportion of GFP+ C57BL/6 BM untreated and C57BL/6 (GFP−) BM treated with CD117/LNP-PUMA. Data represent mean ± SD of n = 4 (recipients of a 25:75 ratio of GFP:C57BL/6+CD117/LNP-PUMA BM), n = 8 (recipients of a 50:50 ratio of GFP:C57BL/6+CD117/LNP-PUMA BM), and n = 4 (recipients of a 50:50 ratio of GFP:C57BL/6 untreated BM) experimental animals per cohort. P values calculated by means of Dunnett's multiple comparison test after one-way ANOVA. ****P < 0.0001. (E) Donor chimerism 4 months post-HSCT. Chimerism calculated as CD45.2%/(CD45.1% + CD45.2%). Data represent mean ± SD of the same cohorts indicated in (A) to (C). One-way ANOVA not significant (P > 0.05). (F) RBC, (G) granulocyte, and (H) hematopoietic cells of the BM, BM subsets, and spleen in recipients conditioned with 0.05 mg/kg CD117/LNP-PUMA and receiving [?]0 × 10^6 GFP+ C57BL/6 BM cells at 6.5 days after treatment. Data in (F) to (H) represent mean ± SD of n = 3 recipient animals. Levels of GFP+ granulocytes and RBCs in unconditioned controls (n = 2) were nearly undetected (0.06 ± 0.03 and 0.05 ± 0.02, respectively) 2 months after BMT. (I) Persistence upon secondary transplantation of CD117/LNP-PUMA-conditioned GFP+ donor BM in lethally irradiated congenic mice. Data represent mean ± SD of n = 8 recipient animals generated from 3 primary chimeras. 


Efficient in vitro editing of primary sickle cell 
disease hematopoietic stem and progenitor cells 
with anti-human CD117 


To assess the feasibility of using this platform for therapeutic human
genome editing, we adapted our targeting to human CD117 and used LNPs that
contained mRNA encoding a Cas9 adenine base editor (ABE) fusion and LNPs
that carried a single-guide RNA (sgRNA) targeted to the β-globin sickle
cell mutation. Adenine base editing (A to G) converts the pathogenic E6V
(HBB^S) mutation to a nonpathogenic E6A variant (HBB^G-Makassar) (78). We
applied this therapeutic strategy to convert pathogenic sickle hemoglobin
(HBB^S) to nonpathogenic G-Makassar hemoglobin (HBB^G) in four sickle cell
specimens from separate donors (fig. S8, A and B). We found that a
molecular excess of sgRNA to ABE mRNA-containing LNPs led to efficient
editing, with the highest rates (88%) at the 10 pg/cell dose (Fig. 4A).
This led to a corresponding increase in HBB^G protein (up to 91.7% of
β-like globin) and a decrease in HBB^S after in vitro erythroid
differentiation, as well as a nearly complete absence of sickled cells
upon exposure of the erythroblasts to hypoxic conditions (Fig. 4, B and
C). Editing levels and the increase of HBB^G were directly correlated
(Fig. 4D). We observed that LNP doses from 3 pg/cell up to 10 pg/cell did
not alter the viability and proliferation rate of erythroid progenitor
cells in vitro (fig. S8, C and D).


PUMA mRNA depletes HSCs from mouse 
BM in vitro 


The survival of human and mouse HSCs depends on the anti-apoptotic gene
Mcl-1 (9, 20); thus, we sought to test the ability of CD117/LNP to deplete
BM cells using pro-apoptotic mRNA. We tested a variety of pro-apoptotic
mRNAs that act within this pathway. Among those genes tested on mouse
C57BL/6 BM cells, treatment with PUMA mRNA reduced BM and LSK viability
after 48 hours and 6 days in culture, respectively (fig. S9A). To confirm
that LNP-PUMA mRNA treatment depleted multilineage hematopoietic stem and
progenitor cells (HSPCs), we performed competitive HSCT in which C57BL/6
CD45.2 BM was treated with CD117/LNP-PUMA ex vivo (5 µg) and transplanted
at equal or increasing ratios against untreated green fluorescent
protein-positive (GFP+) C57BL/6 CD45.2 BM cells into lethally irradiated
congenic C57BL/6 CD45.1 recipients (fig. S9C, schema). If CD117/LNP-PUMA
efficiently depletes HSCs, mice receiving only CD117/LNP-PUMA-treated BM
(C57BL/6 CD45.2) would experience BM failure from the depletion of HSCs,
and those receiving competitive BM would have an overrepresentation of
untreated GFP+ BM. The results were consistent with our expectations: Mice
injected with only CD117/LNP-PUMA-treated GFP− BM cells died within
2 weeks of HSCT, which indicates that HSCs were not viable and did not
engraft. Mice that received 50 or 75% CD117/LNP-PUMA-treated GFP− BM had
<0.5% donor GFP− Gr1+ cells or RBCs (Fig. 5, A and B) at 4 months
(endpoint) versus the expected 50 to 75%. The remainder of donor cells
(CD45.2) were GFP+ (untreated) cells. This is consistent with the
essentially complete depletion of engrafting, multilineage HSCs with
ex vivo treatment of
CD117/LNP-PUMA. By comparison, mice injected with control untreated GFP+/−
C57BL/6 CD45.2 BM at a 1:1 ratio had 25% GFP− cells (Fig. 5, A to D). At
endpoint, all groups had similar donor chimerism (≥94% C57BL/6 CD45.2)
(Fig. 5E).
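
The chimerism readout used above can be written out as a small calculation. The following is a minimal sketch, not the authors' analysis code: it assumes the definition given in the Fig. 5 legend (CD45.2% divided by the sum of CD45.1% and CD45.2%), and the function names and the example percentages are hypothetical, chosen only for illustration.

```python
def donor_chimerism(cd45_2_pct: float, cd45_1_pct: float) -> float:
    """Donor chimerism as CD45.2% / (CD45.1% + CD45.2%), returned as a fraction."""
    total = cd45_1_pct + cd45_2_pct
    if total <= 0:
        raise ValueError("CD45.1% + CD45.2% must be positive")
    return cd45_2_pct / total


def expected_treated_fraction(gfp_parts: float, treated_parts: float) -> float:
    """Share of donor output expected from the treated (GFP-negative) graft
    if the treatment had no effect on engraftment (hypothetical helper)."""
    total = gfp_parts + treated_parts
    if total <= 0:
        raise ValueError("graft proportions must sum to a positive value")
    return treated_parts / total


if __name__ == "__main__":
    # Illustrative flow cytometry percentages, not measured values.
    print(donor_chimerism(cd45_2_pct=94.0, cd45_1_pct=6.0))
    # 25:75 and 50:50 GFP:treated grafts, as in the competitive transplants.
    print(expected_treated_fraction(25, 75))
    print(expected_treated_fraction(50, 50))
```

Under the no-effect assumption, the 25:75 and 50:50 grafts would yield 75% and 50% GFP-negative donor cells, which is the baseline against which the observed <0.5% is compared.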


HSC depletion with CD117/LNP-PUMA allows for 
BM engraftment 


HSC depletion in vivo was confirmed with intravenous injection of
CD117/LNP-PUMA at 0.05 mg/kg in C57BL/6 mice, which showed a 71 and 58%
decrease in the frequency of LSK cells and LT-HSCs, respectively, in BM
isolates 6 days after treatment (fig. S9B). A 0.05 mg/kg mRNA dose was
found to be the maximum tolerated dose. Animals treated with 0.15 mg/kg or
more CD117/LNP-PUMA displayed decreased activity, elevations in the
aspartate transaminase/alanine transaminase (AST/ALT) ratio, venous
congestion of the lungs and liver, and mortality.

We tested in vivo CD117/LNP-PUMA HSC depletion as conditioning for HSCT.
After we confirmed that a liver-specific microRNA (miRNA) binding site
(mir-122) could decrease expression in the liver (27) (fig. S10), we
incorporated liver-specific miRNA binding sites for mir-122 into the
3′ untranslated region of our PUMA mRNA cargo. mir-122 is expressed in
vertebrate hepatocytes and can decrease the expression of transgenes in
hepatocytes. C57BL/6 recipients received 0.05 mg/kg mRNA
CD117/LNP-PUMA-miRNA intravenously 7 days before the infusion of
10 × 10^6 GFP+ C57BL/6 BM cells. The level of engraftment was evaluated
after 2 weeks and up to 16 weeks (endpoint) and confirmed progressive
increase and stabilization of GFP+ Gr1+ cells and RBCs, as well as
hematopoietic cells in the spleen (CD45+) and BM (Fig. 5, F to H); 3.8% of
BM LSK cells were donor. By comparison, C57BL/6-recipient mice not treated
with CD117/LNP-PUMA conditioning failed to engraft donor cells. Secondary
transplantation of the cells that engrafted with PUMA conditioning
phenocopied the donors (Fig. 5I). This shows that in vivo targeting with
CD117/LNP-PUMA effectively depleted HSCs, which allowed GFP+ BM cells to
successfully engraft without need of chemotherapy or irradiation. These
engraftment rates are consistent with those reported to be sufficient for
the cure of SCID with healthy donor BM (22-24) and may overcome BM failure
syndromes.


Concluding Remarks 


Our results suggest that LNPs loaded with diverse mRNA cargos can access
HSCs in the mouse BM niche in situ with a single systemic injection.
Delivery efficacy to LT-HSCs in the BM niche is greatly increased by the
conjugation of a targeting moiety (anti-CD117 antibody). In this work, we
showed that LNPs loaded with a Cre mRNA cargo can induce durable genome
editing in LT-HSCs ex vivo and in vivo at, or above, the levels reportedly
required for the cure of nonmalignant hematopoietic disorders that affect
the erythroid lineage with allogeneic or autologous gene-modified cells
(25, 26). This approach was translated to primary human cells, for which
we were able to achieve high rates of therapeutic base editing in
hematopoietic cells from individuals with sickle cell disease.
Additionally, we demonstrated that a genetic medicine, targeted LNP-mRNA,
can leverage our understanding of HSC biology (Mcl-1 pathway dependence)
to effect cellular state change in vivo with physiologic effects. We used
this system to deplete HSCs in vivo without the genotoxic conditioning
regimens that often result in pulmonary, liver, and reproductive toxicity
(20, 27, 28). Although this conditioning approach requires additional
refinement to reduce toxicity, such as modifications to restrict LNP
tropism and/or further limit gene expression in unintended cells, it has
the capacity to replace current myeloablation approaches. These findings
may transform gene therapy in two ways. First, they could enable the cure
of monogenic disorders, including nonmalignant hematopoietic disorders
(hemoglobinopathies, congenital anemias or thrombocytopenias, and
immunodeficiencies) and nonhematopoietic diseases (cystic fibrosis,
metabolic disorders, and myopathies), with a simple intravenous infusion
of targeted genetic medicines. Second, effecting cell type-specific state
changes in vivo with minimal risk could allow previously impossible
manipulations of physiology. Such delivery systems may help translate the
promise of decades of concerted genetic and biomedical research to treat a
wide array of human diseases.


Stereoselective amino acid synthesis by synergistic
photoredox-pyridoxal radical biocatalysis


Lei Cheng, Dian Li, Binh Khanh Mai, Zhiyu Bo, Lida Cheng, Peng Liu*, Yang Yang*


Developing synthetically useful enzymatic reactions that are not known in biochemistry and
organic chemistry is an important challenge in biocatalysis. Through the synergistic merger of
photoredox catalysis and pyridoxal 5'-phosphate (PLP) biocatalysis, we developed a pyridoxal radical 
biocatalysis approach to prepare valuable noncanonical amino acids, including those bearing a 
stereochemical dyad or triad, without the need for protecting groups. Using engineered PLP enzymes, 
either enantiomeric product could be produced in a biocatalyst-controlled fashion. Synergistic 
photoredox-pyridoxal radical biocatalysis represents a powerful platform with which to discover 
previously unknown catalytic reactions and to tame radical intermediates for asymmetric catalysis. 


The past decade has witnessed the development of several biocatalytic
processes that are not encountered in biology (1-6). Drawing inspiration
from small-molecule catalysis, biocatalysis researchers repurposed natural
flavin- and nicotinamide-dependent enzymes (4, 5) and metalloenzymes
(2, 3, 7, 8) to catalyze unnatural reactions, particularly
stereoselective, free radical-mediated processes (7-15). However, most
unnatural biocatalytic reactions are known to synthetic chemistry (7-14),
and the same transformations could also be achieved using small-molecule
catalysts, albeit with no or lower levels of stereocontrol. We envisioned
that by merging two distinct catalytic cycles (16) involving an enzyme and
a small-molecule catalyst, we would be able to devise activation modes not
previously accessible in conventional enzymology and synthetic chemistry.

We initiated a research program using visible light photoredox catalysis
to unlock the potential of pyridoxal 5′-phosphate (PLP) enzymes (17, 18)
for stereoselective radical reactions, thereby providing access to
valuable noncanonical amino acids (ncAAs) (Fig. 1A). Because of the
ubiquity of ncAAs in bioactive natural products (19), peptide therapeutics
(20), and functional unnatural proteins (21), their efficient
stereoselective synthesis is a major objective within synthetic chemistry
(22) and synthetic biology (23). Traditional chemical synthesis of ncAAs
has relied on the tedious installation and removal of amino- and
carboxylate-protecting groups (22). By contrast, PLP enzymes facilitate
biochemical processes constructing and degrading free amino acids with
outstanding chemical fidelity, thus underscoring their potential as
promising biocatalysts for ncAA synthesis without protecting group
manipulation. As a family, PLP-dependent enzymes are structurally and
functionally diverse (17) and catalyze C-C and C-heteroatom bond-forming
reactions at the α (24, 25), β (26-33), and γ (34, 35) positions of amino
acids through a carbonyl catalysis mechanism (Fig. 1A) (7, 36). Although
open-shell intermediates have been proposed and investigated in
[4Fe-4S]/SAM- and cobalamin-dependent PLP aminomutases (37) and
O2-dependent PLP oxidases (38, 39), almost all biotechnologically useful
PLP enzymes operate through classic closed-shell mechanisms for substrate
activation.

We postulated that if natural PLP enzymes could be reprogrammed to
catalyze unnatural radical C-C bond formation, they would allow us to
access a broad spectrum of ncAAs with diastereo- and enantiocontrol
(Fig. 1, B and C). In particular, we were intrigued by the possibility of
synergistically coupling visible light photoredox catalysis (40-43) and
PLP biocatalysis (17) to furnish a distinct paradigm for pyridoxal radical
biocatalysis (Fig. 1D). In this catalysis mode, the photocatalytically
generated free radical intermediate is captured by an enzymatically
generated covalent intermediate derived from pyridoxal, thereby enabling
stereoselective C-C bond formation in an intermolecular fashion. In
contrast to previously investigated natural (44-46) and unnatural (4, 5)
photoenzymatic catalysis, the present synergistic photoredox-pyridoxal
biocatalysis separates photoinduced radical formation and enzymatic
radical interception in two discrete catalytic cycles. By not relying on
the photochemical properties of the cofactor of the enzyme, this strategy
could create opportunities for the further development of radical
biocatalysis. Furthermore, by engaging reactive catalytic intermediates in
PLP enzymes that remain out of the reach of mechanistically related
small-molecule carbonyl catalysis (36), this synergistic catalysis will
facilitate the discovery of reactions not previously known to synthetic
chemistry.


Design of synergistic photoredox-pyridoxal 
radical biocatalysis 


We focused our design efforts on PLP enzymes capable of functionalizing
the β position of serine, an abundant amino acid building block, and its
derivatives, in part because of the diverse enzymes in this superfamily,
including tryptophan synthases (26, 27), tyrosine phenol lyases (28, 29),
and O-acetylserine sulfhydrylases (30, 31). Initially, visible light
irradiation of a photoredox catalyst (IV) would produce an excited-state
photooxidant (IV*) (Fig. 1D). Single-electron oxidation of the
alkyltrifluoroborate substrate I by IV* would produce a carbon-centered
radical (VI) and the reduced photocatalyst V. Concurrent with this
photoredox catalytic cycle, a β-functionalization, PLP-dependent enzyme
(VII) such as a tryptophan synthase would convert serine and other
β-hydroxy-α-amino acids (II) into an electrophilic aminoacrylate (X)
through a series of established natural intermediates (VIII to X). If the
photocatalytically generated alkyl radical could enter the active site and
engage the biocatalytically formed aminoacrylate X, it would lead to an
enzyme-bound azaallyl radical (XI) (47), an elusive species in natural PLP
biochemistry (17, 18). Subsequent electron transfer/proton transfer
(ET/PT) or proton-coupled electron transfer (PCET) (48, 49) involving the
reduced photocatalyst V would furnish an external aldimine XII, which upon
hydrolysis would release product III.

In contrast to traditional PLP biochemistry, in pyridoxal radical
biocatalysis, the α stereochemistry of the amino acid product III is
determined by the ET/PT or PCET step. It therefore should be possible to
access both L- and D-amino acids through protein engineering (Fig. 1C). If
successfully implemented, this approach would allow the convergent
synthesis of ncAAs with a well-defined stereochemical dyad or triad in a
single manipulation (Fig. 1B), thereby simplifying the diastereo- and
enantioselective assembly of these targets.


Fig. 1. Synergistic photoredox-pyridoxal radical biocatalysis.
(A) Synergistic photoredox and pyridoxal radical biocatalysis.
(B) Diastereo- and enantioselective biocatalytic synthesis of ncAAs with
up to three contiguous stereogenic centers. Red, blue, and yellow spheres
are the generic substituents of the molecule; in the present study, red
spheres are aryl or alkyl, blue spheres are H or alkyl, and yellow spheres
are H or Me. (C) Enantiodivergent synthesis of L- and D-amino acids using
an orthogonal set of engineered PLP enzymes. (D) Synergistic photoredox
and pyridoxal radical biocatalysis: dual catalytic cycle. Enzyme
illustration is made from 5VM5 [PDB ID in (52)].

(A) 1a + 2a to L-3a. Standard conditions: 1 mol % L-PfPLPβ, 10 mol % RhB,
hv (440 nm), 200 mM KPi buffer, pH 6.0, 50°C, 12 h.

entry  deviation from standard conditions                       yield (e.r.) of 3a
1      none                                                     73% (93:7)
2      10 mol % Acr+-Mes instead of RhB                         26% (80:20)
3      1 mol % [Ru(bpy)3]Cl2 instead of RhB                     0%
4      1 mol % [Ru(bpz)3]Cl2 instead of RhB                     1% (80:20)
5      1 mol % [Ir(dF(CF3)ppy)2(dtbbpy)]PF6 instead of RhB      <1% (65:35)
6      no L-PfPLPβ                                              0%
7      5 mol % PLP instead of 1 mol % L-PfPLPβ                  0%
8      no RhB                                                   <1%
9      no hv                                                    0%
10     (rac)-2a                                                 74% (92:8)
11     pH 6.5                                                   77% (86:14)
12     pH 7.0                                                   76% (77:23)
13     pH 8.0                                                   71% (49:51)

(B) 1a + 2a to D-3a. Standard conditions: 1 mol % D-PfPLPβ (L-PfPLPβ
E104G), 10 mol % RhB, hv (440 nm), 200 mM KPi buffer, pH 7.0, 50°C, 12 h.

entry  deviation from standard conditions                       yield (e.r.) of 3a
1      none                                                     79% (6:94)
2      pH 6.0                                                   74% (11:89)
3      pH 6.5                                                   82% (8:92)
4      pH 8.0                                                   73% (5:95)
5      (rac)-2a                                                 79% (6:94)

Fig. 2. Discovery and development of synergistic photoredox-pyridoxal
radical biocatalysis. (A) Discovery and development of a synergistic
photoredox and pyridoxal radical biocatalytic reaction with L-PfPLPβ.
(B) Enantiodivergent pyridoxal radical biocatalysis with D-PfPLPβ
(L-PfPLPβ E104G). Reaction conditions: 1a (1 equiv, 4.0 mM), 2a (3 equiv,
12.0 mM), 1 mol % PLP enzyme (40 µM), 10 mol % RhB (400 µM), hv (440 nm),
and 200 mM KPi buffer at 50°C for 12 hours. Yields are an average of three
runs. See the supplementary materials for details. e.r. values were
determined using Marfey's analysis (60) (see fig. S1 for details).
Active-site illustration of L-PfPLPβ is made from 5VM5 [PDB ID in (52)].


Development of synergistic
photoredox-pyridoxal biocatalytic
C-C coupling


We commenced this study by evaluating the synergistic use of
β-functionalization PLP enzymes (26-33) and photoredox catalysts for
radical β carbofunctionalization of their native substrates (see table S1
for additional results). Benzyltrifluoroborate salt 1a was selected as the
model substrate because of its relatively low redox potential (1.09 V
versus saturated calomel electrode in MeCN) and the enhanced stability of
the resulting benzyl radical (50). Among the PLP enzymes that we
evaluated, the previously evolved "2B9" variant of the Pyrococcus furiosus
tryptophan synthase β subunit developed by Arnold and Buller (33, 51, 52)
showed encouraging activity when combined with an appropriate photoredox
catalyst (Fig. 2A). These tryptophan synthase β-subunit variants are
particularly powerful for biocatalysis (27) because they do not require
the presence of tryptophan synthase α subunit for function (33, 52).
Herein, this 2B9 variant is referred to as L-PfPLPβ. Through a survey of
transition metal and organic photoredox catalysts (Fig. 2B), it was found
that organic photoredox catalysts (42), particularly rhodamine dyes,
facilitated the reaction with the highest efficiency (Fig. 2A, entry 1).
Other photocatalysts furnished inferior results (see table S2). Using
1.0 mol % L-PfPLPβ biocatalyst and 10 mol % rhodamine B (RhB) under
slightly acidic conditions (pH 6.0) furnished the C-C coupling product 3a
in 73% yield and 93:7 enantiomeric ratio (e.r.: L-amino acid/D-amino
acid), favoring the L-amino acid (Fig. 2A, entry 1). Omitting the enzyme
catalyst L-PfPLPβ (entry 6), the photoredox catalyst RhB (entry 8), or the
light source (entry 9) led to little to no product formation, confirming
the dual catalytic nature of this process (tables S3 to S5). Replacing the
enzyme catalyst with 5 mol % free PLP cofactor afforded no product
(entry 7), further underscoring the synergy between the PLP cofactor and
the protein scaffold in enabling this reactivity. Finally, the biocatalyst
was able to selectively use a single isomer of racemic DL-serine
[(rac)-2a] to generate enantioenriched products with the same yield (based
on 1a) and enantiopurity (entry 10; see fig. S3).
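
The catalyst loadings quoted for this reaction follow directly from the concentration of the limiting substrate 1a. The sketch below, which is an illustration rather than part of the reported procedure, converts mol % and equivalents into concentrations, assuming that all loadings are expressed relative to 1a at 4.0 mM as stated in the Fig. 2 legend. The function names are hypothetical.

```python
def loading_to_micromolar(limiting_mM: float, mol_percent: float) -> float:
    """Convert a catalyst loading (mol % relative to the limiting substrate)
    into a concentration in micromolar."""
    if limiting_mM <= 0 or mol_percent < 0:
        raise ValueError("concentration must be positive and loading non-negative")
    return limiting_mM * 1000.0 * mol_percent / 100.0


def equivalents_to_millimolar(limiting_mM: float, equivalents: float) -> float:
    """Convert equivalents (relative to the limiting substrate) into millimolar."""
    if limiting_mM <= 0 or equivalents < 0:
        raise ValueError("concentration must be positive and equivalents non-negative")
    return limiting_mM * equivalents


if __name__ == "__main__":
    substrate_1a_mM = 4.0  # limiting substrate concentration from the Fig. 2 legend
    print("enzyme (1 mol %):", loading_to_micromolar(substrate_1a_mM, 1.0), "uM")
    print("RhB (10 mol %):", loading_to_micromolar(substrate_1a_mM, 10.0), "uM")
    print("serine 2a (3 equiv):", equivalents_to_millimolar(substrate_1a_mM, 3.0), "mM")
```

With 1a at 4.0 mM, these conversions give 40 µM enzyme, 400 µM RhB, and 12.0 mM serine, matching the values listed in the figure legend.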


Identification of enantiodivergent PLP 
enzymes for ncAA synthesis 


Canonical two-electron PLP biocatalysis usually operates under neutral to
basic conditions (pH 7 to 10) (33, 51), and the reaction
enantioselectivity typically does not vary as a function of pH. With
L-PfPLPβ, increasing the pH of the aqueous buffer from 6.0 to 8.0 resulted
in reduced enantioselectivities (Fig. 2A, entry 1 and entries 11 to 13;
see table S6). Performing this reaction at pH 8.0 (entry 13) resulted in
the formation of almost racemic 3a (49:51 e.r.). The unusual pH
sensitivity of the present system indicated a potential pH-dependent
switch of the enantiodetermining mechanism.
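
To make the pH trend easier to read, each enantiomeric ratio can be converted to a signed enantiomeric excess, ee = 100 × (L − D)/(L + D). The following is a minimal sketch under that standard definition; the e.r. values are those reported for L-PfPLPβ in Fig. 2A (entries 1 and 11 to 13), while the function and variable names are hypothetical.

```python
def signed_ee_percent(l_part: float, d_part: float) -> float:
    """Signed enantiomeric excess in percent; positive values favor the L-amino acid."""
    total = l_part + d_part
    if total <= 0:
        raise ValueError("e.r. components must sum to a positive value")
    return 100.0 * (l_part - d_part) / total


# e.r. (L:D) of 3a with L-PfPLPβ at each buffer pH, from Fig. 2A.
ER_BY_PH = {6.0: (93, 7), 6.5: (86, 14), 7.0: (77, 23), 8.0: (49, 51)}

if __name__ == "__main__":
    for ph, (l_part, d_part) in sorted(ER_BY_PH.items()):
        ee = signed_ee_percent(l_part, d_part)
        print(f"pH {ph:.1f}: e.r. {l_part}:{d_part}, ee = {ee:+.0f}%")
```

Expressed this way, the selectivity for the L product drops from 86% ee at pH 6.0 to essentially zero at pH 8.0.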

We tested a small library of PLP enzyme variants (Fig. 2C; see table S1
for additional results) and found that the single mutant E104G of L-PfPLPβ
reversed the enantioselectivity (Fig. 2B, entry 1). We thus refer to
L-PfPLPβ E104G as D-PfPLPβ. Unlike other unnatural biocatalytic processes
using heme and flavoenzymes, in which enantiopreference reversal has
become common (7, 14), in PLP biochemistry, the reversal of
α stereochemistry through protein engineering is notoriously difficult
because the α configuration of amino acids in traditional PLP enzymology
is tightly regulated by the enantiodetermining protonation with a
conserved lysine residue (17, 26). As an active site residue relatively
far from the PLP cofactor, E104 facilitates the deprotonation of the
indole nucleophile in native tryptophan synthase biochemistry (Fig. 2B)
(51). Our studies showed that in the native tryptophan synthase activity,
both L-PfPLPβ and D-PfPLPβ favored the formation of the same natural
L-tryptophan (98:2 and 96:4 e.r., respectively; see fig. S5 for details),
suggesting a different enantiodetermining mechanism in the current
pyridoxal radical biocatalysis. Furthermore, biocatalysis using D-PfPLPβ
was found to be largely insensitive to the pH of the medium, with a small
increase in enantioselectivity at higher pH (Fig. 2B, entries 1 to 4; see
table S6). Under optimized conditions (pH 7), D-PfPLPβ furnished the
enantiomeric product D-homophenylalanine D-3a in 79% yield and 6:94 e.r.
Similar to L-PfPLPβ, D-PfPLPβ was able to use an excess of DL-serine
[(rac)-2a] directly for the production of D-3a with identical yield and
enantioselectivity through a kinetic resolution mechanism (entry 5).
Although L-PfPLPβ accepted D-serine as a substrate with a relatively low
activity toward L-3a formation, D-PfPLPβ exhibited almost no activity
toward D-serine (fig. S3).


Substrate scope of synergistic 
photobiocatalytic ncAA synthesis 


With a set of enantiodivergent protocols in hand, we next examined the
substrate scope of this dual catalytic process (Fig. 3). Both the L- and
the D-amino acid-forming enzymes L-PfPLPβ (Fig. 3A) and D-PfPLPβ (Fig. 3B)
promoted the transformations of a diverse array of trifluoroborate salts.
Benzyltrifluoroborate substrates bearing a para- (3b), a meta- (3c), and
an ortho- (3d) methyl substituent on the aromatic ring were compatible,
furnishing homophenylalanine derivatives in excellent yields. With
L-PfPLPβ, para- (L-3b) substituted substrates afforded higher
enantioselectivities than meta- (L-3c) and ortho- (L-3d) substituted ones.
When D-PfPLPβ was applied, para-, meta-, and ortho-substituted benzylic
substrates furnished uniformly excellent levels of enantiocontrol (D-3a to
D-3l). Synthetically useful halogen substituents such as a fluorine (3f)
and a chlorine (3g), as well as sensitive functional groups such as a
methyl ester (3h) and a cyano group (3i), were tolerated under the current
conditions. Additionally, both electron-deficient (3f, 3g, 3h, and 3i) and
electron-rich (3j) benzyltrifluoroborates underwent smooth transformations
with excellent enantioselectivities. Bulkier bicyclic substrates with a
benzodioxole (3k) and a naphthyl (3l) core were also accepted by the
enzyme. Furthermore, this dual catalytic process is amenable to the
transformation of nonbenzylic radical precursors, albeit in relatively low
yields (3m to 3o). Heteroatom-stabilized (3m), unstabilized primary (3n),
and unstabilized secondary (3o) alkyltrifluoroborates were all viable
substrates, furnishing the corresponding ncAA products in an
enantioselective fashion. These initial activities constitute a starting
point for further optimization through protein and photocatalyst
engineering. Unexpectedly, using D-PfPLPβ, the major enantiomer of 3m was
found to be L-3m.

[Fig. 3 graphic: reaction schemes and product structures with yields and
e.r. values for L-3a to L-3o (panel A, L-PfPLPβ, pH 6.0) and D-3a to D-3o
(panel B, D-PfPLPβ, pH 7.0).]

Fig. 3. Substrate scope of enantiodivergent synergistic
photoredox-pyridoxal radical biocatalysis. (A) Enantioselective
biocatalytic synthesis of L-amino acids. (B) Enantioselective biocatalytic
synthesis of D-amino acids. Reaction conditions: 1 (1 equiv, 4.0 mM), 2a
(3 equiv, 12.0 mM), 1 mol % L-PfPLPβ or D-PfPLPβ (40 µM), 10 mol % RhB
(400 µM), hv (440 nm), and 200 mM KPi buffer at 50°C for 12 hours. Yields
are an average of three runs. e.r. values were determined using Marfey's
analysis (60) (see fig. S1 for details). The variation in e.r. values was
<1%. †L-amino acid was found to be the major product.

[Fig. 4 graphic: reaction schemes and product structures for panels A to C.]

product  yield        e.r. (major)  e.r. (minor)
4a       (80 ± 2)%    >99:1         94:6
4b       (75 ± 2)%    >99:1         97:3
4c       (74 ± 1)%    >99:1         90:10
4d       (79 ± 1)%    >99:1         98:2
4e       (83 ± 2)%    >99:1         98:2
5a       (84 ± 3)%    >99:1         (d.r. > 20:1)

Fig. 4. Dual catalytic assembly of adjacent stereocenters. (A) Diastereo-
and enantioselective biocatalytic synthesis of ncAAs with two contiguous
stereocenters. (B) Diastereo- and enantioselective biocatalytic synthesis
of ncAAs with three contiguous stereocenters. Reaction conditions: 1
(1 equiv, 6.7 mM), 2 (5 equiv, 33.3 mM), 1 mol % L-PfPLPβ (67 µM),
10 mol % RhB (670 µM), hv (440 nm), and 200 mM KPi buffer at 50°C for
12 hours. Yields are an average of three runs. Standard deviations of
yields are provided. Diastereoselectivities and e.r. values were
determined using Marfey's analysis (60) (see fig. S2 for details). The
variation in e.r. values was <1%. (C) Transformation and determination of
relative and absolute stereochemistries of ncAA products. Reaction
conditions: a. NaOH (2 equiv) and 1:1 THF/H2O at 0°C to room temperature
for 3 hours. For x-ray crystal structures, thermal ellipsoids were set at
50% probability; hydrogen atoms are omitted for clarity. See the
supplementary materials for details.

Without further enzyme engineering, the 
current dual catalytic protocol could be readily 
extended to the formation of challenging con- 
tiguous stereocenters with excellent levels 
of diastereo- and enantioselectivity (Fig. 4). 
L-PfPLP® readily accepted threonine (2b) (51), 
an inexpensive and easily available building 
block, in this dual catalytic process, delivering 
extended isoleucine analogs (4a to 4e) with 
vicinal o and f stereocenters in excellent yields 
and stereoselectivities (Fig. 4A). Enantiocon- 
vergent transformation of racemic secondary 
alkyl radical precursors [(7’ac)-1p] led to ncAAs 
bearing three adjacent stereogenic centers 
(5a) with outstanding diastereo- and enan- 
tiocontrol (Fig. 4B). During the course of this 


stereotriad (5a) formation, recovered substrate 
Ip showed 50:50 e.r. at varying conversions 
of Ip, demonstrating that kinetic resolution 
of (rac)-1p was not involved and confirming 
the enantioconvergent nature of this process 
(fig. $7). Finally, t-PfPLP® was able to use an 
excess of racemic p.-threonine [(7ac)-2b] to 
produce the enantioenriched products with 
almost identical yield and enantiopurity (fig. 
S4). L-PfPLP® accepted allo-threonine with low 
levels of activity and stereoselectivity (see fig. S4 
for further details). p-Pf/PLP® displayed very 
low activities (<1% yield) on threonine. 
Previously devised synthetic routes toward 
stereochemical dyads similar to 4 required 
seven or eight steps starting from commercially 
available materials (53-55). In addition, 
these methods relied on the use of a 
chiral auxiliary (53) or proceeded with low 
diastereocontrol (54, 55). We are not aware 
of methods for the stereoselective synthesis 
of stereochemical triad 5. Thus, our synergistic 
catalytic methods represent a synthetically 
valuable advance, granting access to ncAAs 
bearing multiple adjacent stereocenters in a 
single operation. Finally, the relative and absolute 
stereochemistries of the C–C coupling 
products 3a, 4a, and 5a were ascertained by 
x-ray single-crystal diffraction analysis of their 
respective N-acylated products 6a, 6b, and 
6c (Fig. 4C). 

Fig. 5. Computational studies on synergistic photoredox-pyridoxal radical 
biocatalysis. (A) Computed energy profile using a theozyme model. Enthalpy 
values are relative to external aldimine species 7. Except those in 7, active 
site residues are omitted for clarity. (B) Comparison of activation barriers 
from the theozyme and the free PLP cofactor models. Enthalpy values 
are relative to aminoacrylate 10. (C) Optimized structures of radical 
addition transition states from the theozyme model. Bond distances 
are in angstroms (Å). 


Mechanistic and computational studies 


To study the open-shell nature of this dual 
catalytic process, we performed radical trapping 
experiments using 2,2,6,6-tetramethyl- 
1-piperidinyloxy (TEMPO). Under standard 
conditions and in the presence of TEMPO, 
radical trapping product formed in 14% yield 
(figs. S8 and S9). Furthermore, in the model 
reaction (1a + 2a → L-3a), side products derived 
from the benzyl radical, including dibenzyl 
(~1%) and PhCHO (~5%), were observed by 
gas chromatography-mass spectrometry analysis 
(figs. S6, S10, and S11). Together, these 
results are consistent with the formation of the 
benzyl radical under these reaction conditions. 
Furthermore, ultraviolet-visible spectroscopic 
analysis suggested that unlike previously re- 
ported unnatural photoenzymatic catalysis 
with flavin-dependent ene reductases (10, 12), 
charge transfer complexes between the sub- 
strate and the enzymatic intermediate are 
likely not involved in the present pyridoxal 
radical biocatalysis (fig. S12). 

To further elucidate the reaction mechanism 
and the origin of regioselectivity for β 
functionalization, we performed density functional 
theory calculations using a theozyme 
model (56) prepared from a prior structure of 
an engineered tryptophan synthase bound to 
serine [PDB 5VM5; (52)] consisting of cat- 
alytically relevant amino acid residues K82, 
D300, and other residues within 3.0 Å of the 
PLP cofactor (Fig. 5; see the supplementary 
materials for computational details). Consistent 
with previous computational studies on rela- 
ted PLP-dependent enzymes (57, 58), the con- 
version between the internal aldimine and 
external aldimine 7 requires a low activation 
barrier (fig. S16). From the external aldimine 
7, a deprotonation by a lysine residue K82 
(TS-1) takes place, forming a quinonoid intermediate 
8. The β-hydroxy elimination of the 
quinonoid 8 is facilitated by the acidic aspar- 
tate residue D300 (59), leading to the key 
aminoacrylate species 10 with an activation 
barrier of 17.0 kcal/mol relative to 8. The ad- 
dition of the benzyl radical 9 generated from 
the photoredox catalytic cycle to the β carbon 
of aminoacrylate 10 through TS-3 features a 
low activation barrier, giving rise to the azaallyl 
radical intermediate 11. Mulliken spin density 
calculations showed that in this azaallyl rad- 
ical intermediate, the unpaired electron is 
largely located at the Cα atom of the amino acid 
(see fig. S17 for details). This azaallyl radical 
11 then undergoes ET/PT, generating the 
external aldimine 13. ET between azaallyl 
radical 11 and the reduced photocatalyst 
[RhB]*, presumably through a long-range 
ET, is found to be kinetically and thermo- 
dynamically feasible based on Marcus theory 
calculations (figs. S21 to S23). The succeeding 
PT step (TS-4) has a relatively low barrier 
of 16.0 kcal/mol. 
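
For readers unfamiliar with Marcus theory, the textbook expression relates the ET activation free energy to the reorganization energy λ and the driving force ΔG°: ΔG‡ = (λ + ΔG°)²/(4λ). The Python sketch below evaluates only this formula. The function name `marcus_barrier` is hypothetical, and the numerical inputs are placeholders chosen for illustration; they are not the values computed in this work (see figs. S21 to S23 for those).

```python
def marcus_barrier(delta_g0: float, lam: float) -> float:
    """Marcus activation free energy, (lam + delta_g0)**2 / (4 * lam).

    delta_g0: reaction free energy of the ET step (kcal/mol, negative if exergonic)
    lam:      reorganization energy (kcal/mol, must be positive)
    """
    if lam <= 0:
        raise ValueError("reorganization energy must be positive")
    return (lam + delta_g0) ** 2 / (4.0 * lam)


# Placeholder inputs (NOT from the paper): a moderately exergonic ET step.
print(marcus_barrier(delta_g0=-10.0, lam=20.0))

# When -delta_g0 equals lam, the barrier vanishes (activationless ET).
print(marcus_barrier(delta_g0=-20.0, lam=20.0))
```

In this picture, an ET step is kinetically feasible when the driving force and reorganization energy combine to give a small barrier, and thermodynamically feasible when ΔG° is negative.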

Transition state calculations revealed the 
critical role of the protein scaffold in con- 
trolling the regioselectivity during the ben- 
zyl radical (9) addition to aminoacrylate 10 
(Fig. 5, B and C). In the absence of any ac- 
tive site residues, the radical addition to a 
free PLP cofactor-based model system is bare- 
ly selective: Although the addition to the C3 
atom of the pyridine ring (through TS3’) is 
kinetically disfavored because of the disruption 
of aromaticity, the radical additions to 
the β position (C1, through TS3) and the pyridoxal 
aldehyde carbon (C2, through TS3’) of 
the aminoacrylate intermediate (10) have nearly 
identical activation barriers (Fig. 5B, ΔΔH‡ = 
-0.2 kcal/mol; see fig. S18 for details). We 
ascribe the lack of site selectivity with the free 
PLP model to a combination of electronic ef- 
fects favoring the more electron-deficient C2 
with a larger LUMO (lowest unoccupied mo- 
lecular orbital) coefficient and steric effects 
favoring the more exposed β position (C1). 
With the theozyme model, radical addition 
to the β position (C1) has a lower barrier than 
radical addition to C2 (ΔΔH‡ = 3.2 kcal/mol). 
This result indicates that the enzyme envi- 
ronment plays an essential role in imposing 
site selectivity over the radical addition step. 
Within the enzyme active site, the radical ad- 
dition to the pyridoxal carbon C2 is disfa- 
vored because of the presence of glycine 298 
and lysine 98, which block both faces of C2 
from the benzyl radical attack (Fig. 5C). Our 
density functional theory calculations thus 
underscore the importance of the enzyme 
scaffold in ensuring site selectivities during 
pyridoxal radical catalysis. 
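
To put the computed barrier differences in perspective, the following Python sketch converts a ΔΔH‡ value into an approximate rate ratio through a Boltzmann factor, exp(ΔΔH‡/RT). It is an illustration and not part of the reported calculations. It assumes that the enthalpy difference approximates the free energy difference (entropic contributions neglected), that ΔΔH‡ is defined as ΔH‡(C2) minus ΔH‡(C1), and that the temperature is 298.15 K; the function name `rate_ratio` is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # molar gas constant in kcal/(mol*K)


def rate_ratio(ddh_kcal: float, temp_k: float = 298.15) -> float:
    """Approximate k(C1)/k(C2) for a barrier difference ddh = dH(C2) - dH(C1).

    Assumes ddH approximates ddG (entropy differences neglected).
    """
    if temp_k <= 0:
        raise ValueError("temperature must be positive")
    return math.exp(ddh_kcal / (R_KCAL * temp_k))


# Theozyme model: 3.2 kcal/mol in favor of addition at C1.
print(f"theozyme: k(C1)/k(C2) ~ {rate_ratio(3.2):.0f}")

# Free PLP cofactor model: -0.2 kcal/mol, i.e. nearly unselective.
print(f"free PLP: k(C1)/k(C2) ~ {rate_ratio(-0.2):.2f}")
```

Under these assumptions, a 3.2 kcal/mol difference corresponds to a preference for C1 addition on the order of two hundred to one, whereas -0.2 kcal/mol corresponds to a ratio close to unity, in line with the qualitative conclusion that the protein environment imposes the site selectivity.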


REFERENCES AND NOTES 


1. K. Chen, F. H. Arnold, Nat. Catal. 3, 203-213 
(2020). 

2. Y. Yang, F. H. Arnold, Acc. Chem. Res. 54, 1209-1225 
(2021). 

3. O. F. Brandenberg, R. Fasan, F. H. Arnold, Curr. Opin. 
Biotechnol. 47, 102-111 (2017). 

4. B.A. Sandoval, T. K. Hyster, Curr. Opin. Chem. Biol. 55, 45-51 
(2020). 

5. W. Harrison, X. Huang, H. Zhao, Acc. Chem. Res. 55, 
1087-1096 (2022). 

6. C. Klaus, S.C. Hammer, Trends Chem. 4, 363-366 
(2022). 

7. Q. Zhou, M. Chin, Y. Fu, P. Liu, Y. Yang, Science 374, 
1612-1616 (2021). 

8. J. Rui et al., Science 376, 869-874 (2022). 

9. M.A. Emmanuel, N. R. Greenberg, D. G. Oblinsky, T. K. Hyster, 
Nature 540, 414-417 (2016). 

10. K. F. Biegasiewicz et al., Science 364, 1166-1169 (2019). 

11. X. Huang et al., Nature 584, 69-74 (2020). 

12. C. G. Page et al., J. Am. Chem. Soc. 143, 97-102 (2021). 

13. X. Huang et al., Nat. Catal. 5, 586-593 (2022). 

14. Y. Ye et al., Nat. Chem. 15, 206-212 (2023). 

15. H. Fu et al., Nature 610, 302-307 (2022). 

16. A. E. Allen, D. W. C. MacMillan, Chem. Sci. 3, 633-658 (2012). 

17. A. C. Eliot, J. F. Kirsch, Annu. Rev. Biochem. 73, 383-415 (2004). 

18. Y.-L. Du, K. S. Ryan, Nat. Prod. Rep. 36, 430-457 (2019). 

19. J. B. Hedges, K. S. Ryan, Chem. Rev. 120, 3161-3209 (2020). 

20. Y. Ding et al., Amino Acids 52, 1207-1226 (2020). 

21. A. Dumas, L. Lercher, C. D. Spicer, B. G. Davis, Chem. Sci. 6, 
50-69 (2015). 

22. C. Najera, J. M. Sansano, Chem. Rev. 107, 4584-4671 
(2007). 

23. P. J. Almbjell, C. E. Boville, F. H. Arnold, Chem. Soc. Rev. 47, 
8980-8997 (2018). 

24. K. Fesko, G. A. Strohmeier, R. Breinbauer, Appl. Microbiol. 
Biotechnol. 99, 9651-9661 (2015). 

25. K. Fesko, M. Uhl, J. Steinreiber, K. Gruber, H. Griengl, 
Angew. Chem. Int. Ed. 49, 121-124 (2010). 

26. R. S. Phillips, Tetrahedron Asymmetry 15, 2787-2792 
(2004). 



27. E. Watkins-Dulaney, S. Straathof, F. Arnold, ChemBioChem 22, 
5-16 (2021). 

28. D. Milić et al., J. Am. Chem. Soc. 133, 16468-16476 
(2011). 

29. L. Martinez-Montero, J. H. Schrittwieser, W. Kroutil, Top. Catal. 
62, 1208-1217 (2019). 

30. T. H. P. Maier, Nat. Biotechnol. 21, 422-427 (2003). 

31. W. M. Rabeh, P. F. Cook, J. Biol. Chem. 279, 26803-26806 
(2004). 

32. R. J. M. Goss, P. L. A. Newill, Chem. Commun. 47, 4924-4925 (2006). 

33. A. R. Buller et al., Proc. Natl. Acad. Sci. U.S.A. 112, 14599-14604 (2015). 

34. Y. Hai, M. Chen, A. Huang, Y. Tang, J. Am. Chem. Soc. 142, 19668-19677 (2020). 

35. M. Chen, C.-T. Liu, Y. Tang, J. Am. Chem. Soc. 142, 10506-10515 (2020). 

36. Q. Wang, Q. Gu, S.-L. You, Angew. Chem. Int. Ed. 58, 6818-6825 (2019). 

37. P. A. Frey, G. H. Reed, Biochim. Biophys. Acta 1814, 1548-1557 (2011). 

38. E. R. Hoffarth, K. W. Rothchild, K. S. Ryan, FEBS J. 287, 1403-1428 (2020). 

39. E. R. Hoffarth et al., Proc. Natl. Acad. Sci. U.S.A. 118, e2012591118 (2021). 

40. J. M. R. Narayanam, C. R. J. Stephenson, Chem. Soc. Rev. 40, 102-113 (2011). 

41. C. K. Prier, D. A. Rankic, D. W. C. MacMillan, Chem. Rev. 113, 5322-5363 (2013). 

42. N. A. Romero, D. A. Nicewicz, Chem. Rev. 116, 10075-10166 (2016). 

43. K. L. Skubi, T. R. Blum, T. P. Yoon, Chem. Rev. 116, 10035-10074 (2016). 

44. M. Zhang, L. Wang, S. Shu, A. Sancar, D. Zhong, Science 354, 209-213 (2016). 

45. D. Sorigué et al., Science 357, 903-907 (2017). 

46. S. Zhang et al., Nature 574, 722-725 (2019). 

47. S. Tang, X. Zhang, J. Sun, D. Niu, J. J. Chruma, Chem. Rev. 118, 10393-10457 (2018). 

48. D. R. Weinberg et al., Chem. Rev. 112, 4016-4093 (2012). 
49. R. G. Agarwal et al., Chem. Rev. 122, 1-49 (2022). 
50. J.C. Tellis, D. N. Primer, G. A. Molander, Science 345, 433-436 
(2014). 
51. M. Herger et al., J. Am. Chem. Soc. 138, 8388-8391 
(2016). 
52. A. R. Buller et al., J. Am. Chem. Soc. 140, 7256-7266 
(2018). 
53. G. R. Pettit et al., J. Nat. Prod. 78, 476-485 
(2015). 
54. J. Kimura et al., J. Org. Chem. 67, 1760-1767 
(2002). 
55. C. J. Easton, M. C. Merrett, Tetrahedron 53, 1151-1156 (1997). 

56. D. J. Tantillo, J. Chen, K. N. Houk, Curr. Opin. Chem. Biol. 2, 743-750 (1998). 

57. J. Xu et al., Commun. Biol. 3, 455 (2020). 

58. L. Yang et al., Angew. Chem. Int. Ed. 61, e202212555 (2022). 

59. X. Sheng, F. Himo, J. Am. Chem. Soc. 141, 11230-11238 (2019). 

60. R. Bhushan, H. Brückner, Amino Acids 27, 231-247 (2004). 


ACKNOWLEDGMENTS 

We thank Y. Hai (University of California Santa Barbara, UCSB), 
L. Zhang (UCSB), Y. Wang (University of Pittsburgh), and B. Wang 
(Xiamen University) for helpful discussions. Funding: This work 
was supported by start-up funds from UCSB (Y.Y.), the Herman 
Frasch Foundation (grant 947-HF22 to Y.Y.), and the National 
Institutes of Health (grant R35GM128779 to P.L.). We acknowledge 
the BioPACIFIC MIP (NSF Materials Innovation Platform, DMR-1933487) 
at UCSB for access to instrumentation. Calculations 
were performed at the Center for Research Computing at the 
University of Pittsburgh and the Extreme Science and Engineering 
Discovery Environment. Author contributions: Y.Y. designed 
and directed the overall research. L.C. discovered and optimized 
the synergistic photoredox-biocatalytic process. L.C. performed 
protein engineering. D.L. synthesized the trifluoroborate 
substrates and racemic amino acid products. L.C., Z.B., and L.-D.C. 
performed the substrate scope study and characterized all of 
the substrates and products. B.K.M. performed the computational 
studies with guidance from P.L. Y.Y. wrote the manuscript with 
input from all other authors. Competing interests: A provisional 




patent application (US provisional patent number 63/437,491) has 
been filed through UCSB based on the results presented herein. 
Data and materials availability: All data are available in the 
main text or the supplementary materials. Solid-state structures of 
6a, 6b, and 6c are available from the Cambridge Crystallographic 
Data Centre under reference numbers CCDC 2220221, 2220222, 
and 2220224, respectively. Plasmids encoding engineered PLP 
enzymes are available from Y.Y. under a material transfer 
agreement with UCSB. License information: Copyright © 2023 
the authors, some rights reserved; exclusive licensee American 
Association for the Advancement of Science. No claim to original 
US government works. https://www.science.org/about/science-licenses-journal-article-reuse 


Cheng et al., Science 381, 444-451 (2023) 


28 July 2023 


SUPPLEMENTARY MATERIALS 


science.org/doi/10.1126/science.adg2420 
Materials and Methods 


Figs. S1 to S24 

Tables S1 to S8 

References (61-125) 

Data S1 and S2 

MDAR Reproducibility Checklist 


Submitted 11 December 2022; accepted 20 June 2023 
10.1126/science.adg2420 




RESEARCH 


TECHNICAL COMMENT 


PALEONTOLOGY 


Comment on “Ultrastructure reveals ancestral vertebrate pharyngeal skeleton in yunnanozoans” 


Kaiyue He, Jianni Liu*, Jian Han*, Qiang Ou, Ailin Chen, Zhifei Zhang, Dongjing Fu, 
Hong Hua, Xingliang Zhang, Degan Shu 


Tian et al. (Research Articles, 8 July 2022, abm2708) hypothesized that yunnanozoans are stem-group 
vertebrates on the basis of “cellular cartilage”, “fibrillin microfibers”, and a “subchordal rod” associated 
with the branchial arches of yunnanozoans. However, we reject the presence of cellular cartilage and fibrillin, 
as well as the phylogenetic proposal of vertebrate affinities, based on the ultrastructure and morphology of 
yunnanozoans from more than 8000 specimens. 


Tian et al. proposed the existence of “cellular 
chambers” in the “bamboo-like” 
branchial bars of yunnanozoans. However, 
we argue that the putative cellular chambers 
described from the branchial bars 
are artefacts of discoid structures in the bars. 
As has been well documented (1-3), each 
branchial arch, presented as a hollow structure 
(2), bears two bilateral rows of discoid 
structures and an equal number of filaments 
along each bar. The discoid structures, exhibiting 
a topography of prominent bulges or pits and a 
dark-colored circular or oval outline with only a 
small amount of sediment infill (Fig. 1, A to C), 
are clearly distanced from each other (1) 
(Fig. 1E) and are essentially never stacked on top 
of each other along the bar axis. Both light 
microscopy (Fig. 1, E and F) and x-ray computed 
microtomography (micro-CT) (Fig. 1D) of 
yunnanozoans reveal that each discoid structure 
is directly connected to the proximal end 
of a blade-like gill filament [Fig. 1, B and C; 
figure 3 in (4); and figure 1I in (7)]; such a 
one-to-one connection, in total 20 to 25 pairs on a 
bar, demonstrates that the discoid structures 
are in essence the proximal ends of the filaments, 
not tightly stacked chondrocytes varying 
in number. 

Two to four “cellular chambers” delimited 
by “secondary septa” [figure 1E in (5)] are in 
fact vague imprints of laterally compressed 
discoid structures remaining on the gill bar 
(Fig. 1, B to D). Both the upper and the lower 
“cellular chambers” are parts of exposed areas 
between two laterally arranged discoid 
structures along the gill axis. In particular, 
the discoid structures themselves, because of their 
prominent 3D topography, were erroneously 
treated as a “cellular chamber” by Tian et al. 
[figure 1E in (5)]. Hence, neither the discoid 
structures nor the “cellular chambers” are 
single-cell structures, and they are definitely 
not comparable in shape, structure, number, 
or position with the tightly stacked chondrocytes 
of embryonic vertebrates (5 to 10 μm in 
diameter) (6), which diagnostically appear in 
groups of two or four (cell “nests”) (7, 8). 

1Shaanxi Key Laboratory of Early Life and Environments, 
State Key Laboratory of Continental Dynamics, Department 
of Geology, Northwest University (NWU), Xi'an 710069, P. R. 
China. 2Early Life Evolution Laboratory, State Key Laboratory 
of Biogeology and Environmental Geology, School of Earth 
Sciences and Resources, China University of Geosciences, 
Beijing 100083, P. R. China. 3Research Center of 
Paleobiology, Yuxi Normal University, Yuxi, Yunnan 653100, 
P. R. China. 

*Corresponding author. Email: eliljn@nwu.edu.cn (J.L.); elihanj@nwu.edu.cn (J.H.) 

Scanning electron microscopy (SEM) of 
branchial arches reveals a desert of clay minerals 
dominated by illite, but no sign of 
filamentous carbon microfibrils. The co-occurring 
thin carbon films on the gills, a 
major remnant of soft tissues, are amorphous 
and homogeneous under transmission electron 
microscopy (TEM) (Fig. 1G), and 
no carbon microfibrils were found in 
the branchial arches. The Raman spectra of 
the carbon film [figure S3 in (5)] show no 
difference in peak shifts from samples from 
early Cambrian strata in South China (9), 
which are well supported as kerogen produced 
by a high degree of organic maturation 
and asphaltization, associated with enrichment 
of carbon and release of oxygen, nitrogen, 
sulfur, and hydrogen (10). Hence, 
the primary composition and 3D structure 
of cartilage and fibrillin cannot have survived from 
the Cambrian to the present. The fossilized “fibrillin” 
also cannot be preserved at nanoscale resolution, 
given the micron-scale host mineral 
aggregates, regardless of the presence of 
filamentous iron and aluminum-silicon 
clay minerals (11). Thus, the claim of the presence 
of nanoscale wavy, parallel “fibrillin” 
with a beaded structure and cross-linkages in 
the gills (5) conflicts with a half-billion years 
of diagenesis of yunnanozoans and other 
Chengjiang fossils. The parallel “fibrillin” 
showing nanoscale protrusions, filaments, and 
beaded structures most likely reflects cross 
sections of multilayered structures of some 
clay mineral or filamentous halloysite (12). 
In summary, the claims made by Tian et al. 
(5) of “cellular cartilages, fibrillin, and subchordal 
rod” are not supported by current 
micro- or nanoscale evidence; these labile tissues, 
if present in yunnanozoans at all, 
would have been largely destroyed by 
diagenesis. Cartilage-like tissues are also 
present in protostomes [i.e., horseshoe crabs 
(Merostomata), cephalopod molluscs (Cephalopoda), 
and sabellid polychaete worms (Annelida)] (8) 
and thus can hardly contribute to 
paleontological classification. By contrast, various 
iconic macroscopic organ-level traits of 
chordates, such as a notochord, zigzag-shaped 
myomeres, a post-anal tail, and dorsal 
and ventral fins, remain highly controversial 
in yunnanozoans (2). The well-accepted traits 
of vertebrates, especially a brain, camera-like 
eyes, paired nasal sacs, and vertebrae, which 
are exceptionally preserved in undisputed 
Cambrian vertebrates such as Haikouichthys 
(13) and Metaspriggina (14), are 
absent in yunnanozoans. Additionally, no cellular 
structure could be found in specimens of 
gill cartilages and vertebrae of Haikouichthys 
using SEM and micro-CT. Only robust 
macroscopic organs and characteristics 
represent reliable standards for identifying 
animal phyla in Cambrian paleontology. 
Therefore, we propose that yunnanozoans are 
not stem-group vertebrates based on the available 
evidence; they are at most considered a 
sister group to either chordates (Fig. 2; for the 
database and other supplementary materials, 
see https://doi.org/10.6084/m9.figshare.c.6492880.v1) 
or vetulicolians (15). 

Fig. 1. Branchial arches and filaments with ultrastructures of 
Haikouella from Jianshan and Ercaicun sections, Haikou, 
Kunming. (A to C) An overview of branchial arches in specimen 
ELI-JS639 (A). Enlargement (B) and line drawing (C) of boxed area in 
(A) show detailed anatomy of each arch. (D) Micro-CT image of branchial 
arches in ELI-JS5147A. (B) to (D) One-to-one connection between a 
gill filament and a discoid structure. (E) Linear arrangement of discoid 
structures clearly distanced (black arrows) in ELI-JS1320A. 
(F) Hollow morphology of gill filaments (white arrows) and gill axis in 
ELI-JS292. (G) Transmission electron image showing that homogeneous 
organic matter (between two dashed lines) and clay minerals 
(outside of two dashed lines) are present in branchial arch of 
ELI-JS172B. (H) Reconstruction of branchial arches. Abbreviations: 
ds, discoid structures; g1-6, gill 1 to 6; gf, gill filaments; gr, gill raker; 
vbv, ventral blood vessel. Scale bars: 1 mm in (A); 200 μm in (B), 
(E), and (F); 100 μm in (D); 100 nm in (G). 
WORKING LIFE 


When work gets personal 


By Abraham Stijn Meijnikman 


At 9 years old, I sat in a hospital waiting room after a severe hypoglycemic episode, a sharp drop 
in blood sugar. I was devastated: The episode had derailed my hopes of playing for a national 
league soccer team, and for the first time I truly understood the profound impact my type 1 diabe- 
tes would have on my dreams for the future. The pediatrician who treated me must have noticed 
because he said to me, “Listen, kid, you have diabetes, but you are not your diagnosis! You can do 
anything as long as you put your heart into it.” Those words left a mark, sparking an ambition to 
become a doctor and help people with diabetes. It turned out to be a tough path to follow. 


I was ecstatic when, years later, I 
was accepted to medical school. I 
focused much of my clinical work 
on diabetes. During my second year, 
I also began to conduct research on 
the disease. The work felt mean- 
ingful because I was finally in the 
trenches working to help people. 

But in time, I began to feel over- 
whelmed. The constant reminders 
of diabetes, the scent of insulin, 
and the daily struggle with my 
own disease—where every meal 
had to be carefully weighed and 
calculated—made me feel as if my 
entire world revolved around dia- 
betes. I saw major complications 
in patients who diligently followed 
their treatment, reminding me of 
the seriousness of the disease and 
how hard it can be to control. And 
in my personal life, I watched my 
father, who had also been diagnosed with type 1 diabetes, 
struggle with one complication after another. 

Around that time, I attended a conference where one of 
the plenary speakers discussed the severe brain impacts of 
hypoglycemia, which is often hard for patients with diabe- 
tes to avoid completely. The data were convincing, which 
made me ponder whether it is even possible to live with 
diabetes and avoid long-term health issues. My situation 
and that of my father made the risks feel very personal. At 
that moment, I reached a breaking point. 

For the sake of my mental health, I decided to take a 
step back from diabetes professionally. I took a position 
as a Ph.D. student in a lab that studies liver disease. The 
university was located closer to my family in the Neth- 
erlands, which was important because my father’s health 
was declining. I also developed a deep passion for liver 
research, which allowed me to continue helping people 
with diabetes—many of whom have liver disease—at more 
of a distance. 




I continued to tackle that kind 
of research after I graduated, stay- 
ing on in the same lab as a postdoc 
while also completing a residency 
in gastroenterology. After a while, 
though, I started to feel a pull back 
toward diabetes. 

My father passed away because 
of complications of the disease. I 
also had my own health scare. After 
finishing a grueling 12-hour shift in 
the hospital, I rushed to a confer- 
ence to give a presentation, only to 
have my continuous glucose sensor 
fall off, leaving me unable to moni- 
tor my blood sugar levels. Later, as 
I sat on the tram, I could tell that 
my blood sugar was plummeting. I 
hurried to a cafeteria to get a sug- 
ary snack only to find it was closed. 
I tried the vending machine, but it 
was filled with vegetables instead of 
the usual candy and sodas—a change intended, ironically, 
to reduce metabolic diseases such as type 2 diabetes. That’s 
the last thing I remember from that evening. 

I woke up in the emergency room, disoriented and un- 
able to recall where I was, what day it was, or even who I 
was. I learned from the doctors that I was lucky that some- 
one brought me to the hospital. In that vulnerable state, I 
realized I couldn’t completely turn my back on my diabetes. 

That day, I fully embraced my destiny. Diabetes was a 
part of me, and I could use my unique perspective to con- 
tribute to the field. 

I recently transitioned to a new postdoc position that al- 
lows me to work on a diverse slate of projects, some focused 
on diabetes and others on liver disease. I’m hopeful that 
this situation will be more sustainable for me. I want to 
study diabetes, but I don’t want to be overwhelmed by it. 


Abraham Stijn Meijnikman is a postdoctoral fellow at the University of 
California, San Diego. Send your career story to SciCareerEditor@aaas.org. 

new products 


PCR Thermal Cyclers 

The PTC Tempo 96 and PTC Tempo Deepwell 
Thermal Cyclers are a new generation of conventional 
PCR thermal cyclers from Bio-Rad. The latest 
design is built with a refreshed, intuitive user 
interface and flexible connectivity features for 
streamlining protocol management. A motorized lid allows for automation 
compatibility and modern instrument control for workflows. 
The PTC Tempo Thermal Cycler is engineered to maintain superior 
thermal performance, delivering accurate results alongside the flexibility 
to expand and grow within the laboratory. These PCR instruments 
offer enhanced usability with features designed to support 
academic, commercial, and biopharma labs in conducting basic and 
translational research, process development, and quality control. 
Bio-Rad Laboratories 
For info: 1-800-424-6723 
bio-rad.com/ptc-tempo 

Needle-Free Transfer Urine Collection System 

SARSTEDT's Needle-Free Transfer Urine Collection System is a closed 
urine collection system for transfer of urine from the needle-free cup 
or urine collection container to a Urine Monovette. This needle-free 
cup and collection container feature an integrated transfer unit with a 
pierceable membrane. Following collection using our Urine Monovette, 
the membrane immediately reseals ensuring hygienic and repeat- 

able sample collection. Additionally, the Needle-Free collection system 
minimizes the risk of injury and increases user safety as there is no risk 
of accidental needle sticks. For disposal, the NFT urine collection system 
does not require a sharps container, making it an economical choice. 
SARSTEDT 

For info: 1-800-257-510 

www.shop.sarstedt.us/nft 


Matrix Hydrogel 

AMS Biotechnology (AMSBIO) announces the launch of Extragel, a like- 
for-like replacement for Matrigel™, Geltrex™, and Cultrex™ Basement 
Membrane Extract (BME). Available ex-stock, with overnight shipment, 
Extragel is a reconstituted matrix hydrogel formed from basement 
membrane components extracted from mouse tumor tissues, which 
are rich in extracellular matrix proteins. Designed to be used undiluted 
or diluted to a specific protein concentration, Extragel is proven to 

be a highly effective growth scaffold for in vivo xenograft generation, 
pluripotent stem cell maintenance and differentiation, angiogenesis, 
and advanced 3D culture applications. Extragel is composed of laminin, 
collagen IV, and heparan sulfate proteoglycans plus various growth fac- 
tors including epidermal growth factor, platelet-derived growth factor, 
nerve growth factor, basic fibroblast growth factor (FGF-2), transforming 
growth factor-β, and insulin-like growth factor. 

AMS Biotechnology (AMSBIO) 

For info: +1-800-987-0985 
www.amsbio.com/3d-cell-culture-extracellular-matrices/extragel 


Sequencing System 

PacBio, a leading developer of high-quality, highly accurate sequencing 
solutions, today announced the Revio™ long-read sequencing system, 
which will enable customers to dramatically scale their use of PacBio’s 
celebrated HiFi sequencing technology. Revio is designed to provide 
customers with the ability to sequence up to 1,300 human whole 
genomes per year at 30-fold coverage for less than $1,000 per genome. 
With this scale and pricing, PacBio believes Revio will enable the use of 
HiFi sequencing for large studies in human genetics, cancer research, 
agricultural genomics, and more. Revio will be PacBio's first system to 
feature state-of-the-art NVIDIA GPUs, providing a 20-fold increase in 
computing power compared to the Sequel IIe. In addition to providing 
accelerated basecalling to meet Revio's higher throughput, the AI-enabled 
computer will integrate deep learning algorithms to detect DNA 
methylation from standard sequencing libraries, and DeepConsensus, 

a deep learning method developed with Google Health to improve the 
yield and accuracy of HiFi sequencing. 

PacBio 

For info: 1-877-920-7222 

www.pacb.com 


Transgenic HLA Mice 

Transgenic HLA mice have a long history in autoimmune disease, 
infectious disease, and vaccine research. As with so many immunology- 
related research tools, these well-established models are finding new 
applications in immuno-oncology. HLA proteins (called major histo- 
compatibility complex or MHC in non-human species) process and pre- 
sent antigens as part of the T cell-mediated adaptive immune response. 
The MHC system varies greatly between humans and preclinical test 
species—even primates—so transgenic mice that express human HLA 
transgenes are an ideal model for this aspect of human immune re- 
sponse. Immuno-oncology scientists are now using transgenic HLA mice 
to study cancer vaccines. They can be used to assess immunogenicity of 
cancer vaccine epitopes and in anti-tumor efficacy experiments. To sup- 
port use in syngeneic tumor experiments with common C57BL/6 tumor 
cell lines, Taconic has launched a portfolio of transgenic HLA mice on 
the C57BL/6 inbred strain background, which complements the hybrid 
strain versions previously available. 

Taconic 
For info: +1-888-822-6642 

www.taconic.com/ 


Compact Detection Devices for Biomolecular Applications 

The biocompatible flow path and minimal heat output of Runge Mikron 
detectors from Biotech Fluidics makes them perfect for analysis and 
preparative purification of biomolecules, especially in refrigerated 
environments. Based upon fixed wavelength LED light sources with a 
lifetime of more than 5,000 hours, power consumption below 2.5 watts, 
and start-up within seconds, Runge Mikron detectors are ideal for incor- 
poration into portable field instruments and online monitoring devices. 
Runge Mikron detectors are available for photometric, fluorimetric, and 
conductivity measurements. Measuring only 80 mm-150 mm long, each 
modular detector light source, filter, and measuring cell can be adapted 
to provide an optimized solution for your application. Different modules 
can also be combined to provide a fluidic monitoring system for more 
challenging biomolecular applications. Typically, Runge Mikron detec- 
tors are more affordable and compact than a fully variable detector. 
Runge Mikron detectors are easy to connect to almost any instrument 
as they communicate through and draw power from a single USB-C 
port. Drivers are provided for a growing range of laboratory control and 
chromatography software. Alternatively, an open protocol can be used 
for customized implementation. 

Biotech Fluidics 

For info: +1-612-703-5718 
www.biotechfluidics.com/products/detectors/mikron 

